ABSTRACT

We construct a stellar cluster catalog for the Panchromatic Hubble Andromeda Treasury (PHAT) 
survey using image classifications collected from the Andromeda Project citizen science website. We 
identify 2,753 clusters and 2,270 background galaxies within ~0.5 deg^2 of PHAT imaging searched, or
~400 kpc^2 in deprojected area at the distance of the Andromeda galaxy (M31). These identifications
result from 1.82 million classifications of ~20,000 individual images (totaling ~7 gigapixels) by tens of
thousands of volunteers. We show that our crowd-sourced approach, which collects >80 classifications
per image, provides a robust, repeatable method of cluster identification. The high spatial resolution
Hubble Space Telescope images resolve individual stars in each cluster and are instrumental in the
factor of ~6 increase in the number of clusters known within the survey footprint. We measure integrated
photometry in six filter passbands, ranging from the near-UV to the near-IR. PHAT clusters span 
a range of ~8 magnitudes in F475W (g-band) luminosity, equivalent to ~4 decades in cluster mass. 

We perform catalog completeness analysis using >3000 synthetic cluster simulations to determine 
robust detection limits and demonstrate that the catalog is 50% complete down to ~500 M⊙ for ages
<100 Myr. We include catalogs of clusters, background galaxies, remaining unselected candidates, 
and synthetic cluster simulations, making all information publicly available to the community. The 
catalog published here serves as the definitive base data product for PHAT cluster science, providing 
a census of star clusters in an L* spiral galaxy with unmatched sensitivity and quality. 

Subject headings: catalogs — galaxies: individual (M31) — galaxies: star clusters: general 


1. INTRODUCTION 

Observations of our Local Group neighbor, M31,
present the best opportunity for a detailed yet comprehensive
study of a large spiral galaxy, providing a local
analog to the disk-dominated systems that populate
wide-field galaxy surveys. While the Milky Way allows
analysis at the highest level of detail, studying our host
galaxy as a whole proves difficult due to distance ambiguities
and large amounts of dust attenuation within
the Galactic plane. Conversely, studying galaxies beyond
the Local Group necessitates a substantial decrease
in data quality and content due to reduced spatial resolution
and rising photometric completeness limits.

Similarly, Andromeda is an excellent target for obtaining
a big-picture view of a galaxy’s stellar cluster population.
While many extragalactic cluster samples exist,
each offering galaxy-wide coverage unattainable in
the Milky Way, M31’s proximity provides a number of
sensitivity-based advantages. Using the power of the
Hubble Space Telescope (HST), we can obtain a census of
Andromeda’s star cluster population that extends deep
into the low-mass regime while simultaneously resolving
individual stars within each cluster. The ability to resolve
individual stars also allows for thorough analysis of
M31’s field star populations, leading to detailed comparisons
of field and cluster populations and enabling studies of
cluster formation and dissolution in the context of the
galaxy’s overall star formation activity.

The Panchromatic Hubble Andromeda Treasury survey
(PHAT; Dalcanton et al. 2012) provides contiguous,
high spatial resolution imaging of approximately
one-third of the M31 disk using the HST, observed in
six broadband filters that span from the near-UV to
the near-IR. The Year 1 cluster catalog (Johnson et al.
2012; hereafter, Paper I) presented cluster results from
the first 20% of the survey data. In this paper, we present
a final, survey-wide cluster catalog created through a
crowd-sourced, visual search of the data. The contribution
of citizen scientists to astronomical research is not
novel: projects such as Galaxy Zoo (Lintott et al. 2008;
Willett et al. 2013), the Milky Way Project (Simpson
et al. 2012), and Planet Hunters (Schwamb et al. 2012)
have previously made use of crowd-sourcing. In this work
we analyze image classifications collected from the Andromeda
Project, a website established explicitly for the
identification of star clusters in the PHAT dataset.

We utilize these data to assemble a cluster catalog that
reaches cluster masses below 10^3 M⊙. This level of catalog
completeness represents a significant extension to
previous ground-based studies of M31, which mainly focused
on old massive globular clusters, presented in the
compilations of Caldwell et al. (2009, and online updates),
the Revised Bologna Catalog (RBC; Galleti et al. 2004,
last updated 2012 August to v5), and Huxor et al. ([...]).
[...] efforts in M31 by Williams & Hodge (2001) and the
series of Hodge-Krienke catalogs (Krienke & Hodge 2007,
2008, 2013; Hodge et al. 2009, 2010; hereafter collectively
[...]). High spatial resolution imaging allows for the
identification of less massive
clusters through its ability to differentiate between single
stars and compact clusters, but previous HST-based
studies were limited to isolated targeted observations. In
contrast, PHAT’s contiguous wide-area coverage allows
us to study cluster populations across the entire northeast
quadrant of M31.

The catalog presented here serves as the basis for future
work that will further characterize the sample: basic
cluster parameter determinations (age, mass, A_V;
Beerman et al. 2012; Fouesneau et al. 2014; Beerman et
al., in prep), spatial profiles (Fouesneau et al., in prep),
and comparison to spectroscopically-derived properties
of the globular cluster population (Caldwell et al., in
prep). Once characterized, the star clusters presented
here will be used as input for a variety of explorations by
the PHAT collaboration and others. As part of PHAT,
we will place constraints on the high-mass stellar initial
mass function (D. Weisz et al., in prep), and measure
cluster formation efficiency throughout the galactic
disk (L.C. Johnson et al., in prep) to test theoretical
model predictions (Kruijssen 2012). Further, we will
constrain cluster dissolution time scales (M. Fouesneau
et al., in prep) in an effort to differentiate between competing
models (mass-dependent versus mass-independent
dissolution; Fall et al. 2009; Boutloukos & Lamers 2003;
Bastian et al. 2012; Chandar et al. 2010b).

We begin with a description of the citizen science website
and data in Section 2. Section 3 discusses the process
of converting contributions from citizen scientists
into a catalog of objects, while Section 4 characterizes
the make-up and completeness of the final catalogs. We
present our cluster catalog and accompanying integrated
photometry in Section 5. Section 6 includes a comparison
of the current catalog with our previous Year 1 work and
a discussion of how this cluster sample fits within the context
of other well-known cluster catalogs. We conclude
with a summary of our work in Section 7. Throughout
this work, we assume a distance modulus for M31 of
24.47 (785 kpc; McConnachie et al. 2005), where 1 arcsec
corresponds to a physical size of 3.81 pc.
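
The adopted distance and angular scale follow from the standard distance modulus relation and the small-angle approximation. The short sketch below is an illustration only; the function names are ours and are not part of any PHAT software.

```python
import math

# One arcsecond expressed in radians.
ARCSEC_IN_RADIANS = math.pi / (180.0 * 3600.0)


def distance_pc(distance_modulus):
    """Distance in parsecs from m - M = 5 log10(d / 10 pc)."""
    return 10.0 ** (distance_modulus / 5.0 + 1.0)


def pc_per_arcsec(distance_parsec):
    """Projected size (pc) subtended by 1 arcsec, small-angle approximation."""
    return distance_parsec * ARCSEC_IN_RADIANS


# A distance modulus of 24.47 gives a distance within 1% of the quoted 785 kpc.
d_m31 = distance_pc(24.47)

# Using the adopted 785 kpc, 1 arcsec corresponds to about 3.81 pc.
scale = pc_per_arcsec(785e3)
```

The same scale converts any angular size quoted in this paper to a projected physical size at the distance of M31.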

1.1. Cluster Definition 

A star cluster can be defined in the most general sense 
as a grouping of stars that are spatially and temporally 
correlated. Beyond this broad definition, the notion of 

^ http://www.cfa.harvard.edu/oir/eg/m31clusters/M31_Hectospec.html

^ http://www.bo.astro.it/M31/ 


a star cluster can vary significantly, depending mostly
on whether the system is still embedded in its natal gas
or exposed (Lada & Lada 2003). Older (>10-30 Myr)
gas-free systems are relatively straightforward to classify
using a criterion based on the gravitational boundedness
of individual members to a larger group. In contrast,
young groupings of stars that are still embedded within
the ISM make classification a difficult, uncertain task.
These embedded clusters are still forming through hierarchical
merging of sub-clumps (Allison et al. 2010),
and the application of various stellar density thresholds
to identify distinct features of a continuous (scale-free)
distribution leads to interpretative challenges (Bressert
et al. 2010; Gieles et al. 2012). Embedded environments
are dynamically evolving, and membership within a particular
gravitational grouping is neither well-defined nor
unique.

For the PHAT cluster catalog, we work mostly in the
exposed, gas-free regime because our identification is
based on optical imaging. Once the gas has been expelled
from a star cluster and its stars have evolved
through multiple dynamical times, it becomes possible
to infer whether a grouping of stars is gravitationally
bound or expanding and dissolving (Gieles &
Portegies Zwart 2011). Therefore, uncertainties pertaining
to boundedness are minimal for our sample because a
majority of PHAT clusters are already many dynamical
times old, as inferred from the age and mass distributions
of the Year 1 catalog (Fouesneau et al. 2014).

At young ages (<10 Myr), the use of boundedness as
a selection criterion for clusters becomes difficult. Due
to the similar appearance (i.e., radial spatial profile)
of bound clusters and unbound stellar associations at
young ages, determining boundedness becomes an uncertain
and contentious enterprise (e.g., see Chandar et al.
2010a; Bastian et al. 2012; Whitmore et al. 2011). In
the work that follows, we include all objects identified as
part of our search. As a result, our catalog may include
a heterogeneous mix of bound and unbound objects at
ages <10-30 Myr. We choose this approach in an effort
to maximize the return for science cases that do not depend
on the differentiation between bound clusters and
unbound associations, while allowing open discussion of
differing cluster definitions where they affect the resulting
scientific interpretation. Overall, we seek a catalog of
objects that are spatially and temporally correlated and
can be reasonably approximated as simple stellar populations.
While this goal is easily achieved for a majority
of the sample, we will make a point to identify regions of
parameter space that contain debatable objects, allowing
the reader to make informed decisions with regard
to boundedness. A full exploration of the question of
boundedness requires detailed age and spatial structure
information (Gieles & Portegies Zwart 2011), which is
beyond the scope of this work.

The inclusive philosophy that we adopt in this work
represents a shift from the approach we took in Paper I,
where we discarded objects that were classified as likely
associations. This paper’s inclusive methodology leads to
a modest ~15% increase in clusters when compared to
the Year 1 catalog within their shared imaging footprint.
We discuss these catalog differences in detail in Section
6.2, but find good overall agreement between the two
samples.

2. THE ANDROMEDA PROJECT 

In Paper I, we presented a sample of 601 clusters identified
in a visual search carried out by eight professional
astronomers, which examined the first 20% of imaging acquired
by the PHAT survey. This task was time consuming;
the initial identification of cluster candidates and
subsequent quality ranking of the candidates required
more than a month of effort from each scientist involved.
This cost limited our cluster search in two significant
ways. First, only 3-4 people looked at each image to
make initial identifications of cluster candidates. Of the
601 clusters, 23 were originally identified by just a single
person, suggesting that a small number of additional
good cluster candidates were probably missed in our initial
search. Second, characterization of the cluster completeness
was done with a sample of 550 synthetic (artificial)
clusters. This relatively small sample of synthetic
clusters limited our ability to track the completeness as
a function of age, mass, cluster size, and galactocentric
radius.

Our original plan for extending the cluster search to
the full PHAT footprint was to devise an automated algorithm
to identify clusters using the Year 1 sample as a
training set. This approach proved challenging because
all of the automated techniques we tested produced samples
with at least as many contaminants as true clusters.
Expert by-eye verification would have been necessary
to reduce the number of contaminants to an acceptable
level. This verification would have been time consuming
and the resulting catalog would still suffer from subjectivity
issues. In addition, the goal of robustly characterizing
the catalog selection function becomes difficult,
requiring an understanding of human and machine behavior
and their joint interaction.

The failure to devise a fully automated cluster identification
technique, combined with the difficulty of scaling
our original by-eye techniques to the full dataset, led us
to create the Andromeda Project. This crowd-sourced
solution allows us to scale a by-eye search to the volume
of data available from PHAT, improve the robustness
and repeatability of cluster identifications, and accurately
characterize the catalog completeness function.

2.1. Interface 

The Andromeda Project (AP) is a website built and
hosted by the Zooniverse citizen science platform. The
AP interface is based on previous tools and code used for
the Seafloor Explorer project, another Zooniverse project
that aims to survey scallops, seastars, and other aquatic
life using underwater imaging.

Upon entering the AP website, visitors are presented
with the primary option to start classifying data, as well
as links to find out more about the project. Individuals
who start classifying for the first time are directed to a
tutorial image, where the basic functionality of the classification
screen is explained. The classification screen
is shown in Figure 1. By default, the site displays a
color image constructed from F475W and F814W imaging.
By clicking on the “B/W” button, participants can
change the image to an inverted F475W grayscale image
in which it is often easier to distinguish individual stars
and faint image features. The site’s marking tool is set
for cluster identification by default; modes for identifying
background galaxies and three types of image artifacts
are also available. Markers for clusters, galaxies,
or ghost artifacts are circular, positioned by clicking the
center of an image feature and dragging outward to select
the desired radius. Only the cluster and galaxy markings
are utilized in this paper.

4 http://www.andromedaproject.org

5 http://www.zooniverse.org

After clicking on the “Finished” button, volunteers
are shown the location of the field they were classifying
within M31 and given the option to discuss the images in
the AP Talk forum. This feature enables new volunteers
to get help identifying clusters, and allows participants
to highlight interesting or confusing objects and discuss
them with other volunteers and the science team. After
choosing whether or not to enter the Talk forum, volunteers
are presented a new search image; the AP image
database ensures that no user sees the same image twice.

Volunteers are urged to log in or create a Zooniverse
account, but participants are allowed to classify an unlimited
number of images as an unregistered user. Unregistered
users do, however, receive periodic messages
suggesting that they log in or create an account. Registration
allows analysis of volunteers’ classification behavior
using consistent (anonymous) identifiers. Input from
unregistered users can still be aggregated from within
a single classification session; however, the (anonymous)
identifiers tend not to carry over from session to session
and could be shared by multiple unregistered users, limiting
the depth of analysis we can perform.


2.2. Input Data & Synthetic Clusters 

Each search image shown on the AP site was extracted
from high-resolution (0.05 arcsec pixel^-1) HST/ACS images
of M31. A vast majority of these images came from
the PHAT dataset; we show the survey’s imaging footprint
in Figure 2. The prominent rectangular regions in
Figure 2 that divide the survey into 23 parts are referred
to as “bricks”; their numbering increases from SW to
NE along the major axis, starting with the brick enclosing
the galaxy nucleus, B01 (see Fig. 1 in Dalcanton et al.
2012). In addition to the optical (F475W, F814W; equivalent
to g, I) ACS images, PHAT also obtained near-UV
(F275W, F336W; the latter is equivalent to U) and near-IR
(F110W, F160W; similar to J, H) imaging using the
HST/WFC3 instrument. Additional information about
PHAT imaging data and survey design is available in
Dalcanton et al. (2012) and Paper I.

In addition to the PHAT data, we also processed and
prepared ACS data from the HST archive (PID: 10273;
PI: Crotts) that covered portions of M31 not imaged by
PHAT. The imaging footprint for these data is also
shown in Figure 2. This program obtained two-filter optical
imaging using filters (F555W, F814W) similar to
those used by PHAT, allowing easy incorporation into
the AP search. Due to significant differences in data
richness for objects identified in the archival dataset compared
to the PHAT imaging, we choose not to include
these objects in any further analysis, but we present object
catalogs in Appendix E.


6 http://talk.andromedaproject.org

Fig. 1.— The web-based classification interface for the Andromeda Project. The tutorial image used to train participants is shown here,
which includes all three object types: clusters (yellow), background galaxies (purple), and artifacts (i.e., a saturated star with diffraction 
spikes; red). 


We created AP search images using 725x500 pixel
(36.25x25 arcsec; ~138x95 pc in projected size) extractions
from single-field ACS images. This subimage size
efficiently divides the image area, includes 100 pixels of
overlap between neighboring subimages to reduce incompleteness
and biases caused by image edges, and allows
participants to search images at full resolution. The parent
ACS images have missing data due to the camera’s
chip gap that we filled using overlapping data from neighboring
ACS images. Gaps, edges, and other artifacts are
still present in some images, but our efforts mitigated
most issues concerning missing data. We created a total
of 13,017 subimages (4.7 gigapixels) from imaging that
spans the entire PHAT survey region, as well as 1,728
additional subimages from archival imaging.
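
As an illustration of the subimage geometry, the sketch below tiles one image axis with fixed-size windows that share at least 100 pixels with their neighbors, and converts the subimage dimensions to projected sizes using the pixel scale and the adopted 3.81 pc per arcsec. The parent image dimension (4300 pixels) is a hypothetical value chosen only to exercise the function, and the tiling rule is an assumption on our part rather than a description of the actual extraction code.

```python
PIXEL_SCALE_ARCSEC = 0.05   # ACS pixel scale quoted in the text
PC_PER_ARCSEC = 3.81        # adopted scale at the distance of M31
TILE_W, TILE_H, OVERLAP = 725, 500, 100


def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` covering [0, length).

    Neighboring tiles share at least `overlap` pixels; the final tile is
    placed flush with the far edge (an assumed convention).
    """
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)
    return origins


def size_pc(n_pixels):
    """Projected size in parsecs of `n_pixels` ACS pixels."""
    return n_pixels * PIXEL_SCALE_ARCSEC * PC_PER_ARCSEC


# Hypothetical parent image axis of 4300 pixels.
x_origins = tile_origins(4300, TILE_W, OVERLAP)

# Subimage footprint: about 138 pc by 95 pc.
width_pc, height_pc = size_pc(TILE_W), size_pc(TILE_H)

# 13,017 PHAT subimages correspond to about 4.7 gigapixels.
total_gigapixels = 13017 * TILE_W * TILE_H / 1e9
```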

In addition to the normal imaging, we produced
search images that included synthetic clusters.
The primary reason for inserting these synthetic test objects
is to measure the cluster catalog completeness as a
function of age, mass, size, and environment. In addition,
the synthetics provided feedback to our volunteers: when
a participant identified a synthetic cluster, they were notified
that the object was synthetic and congratulated on
their find. Participants on the site’s Talk forum confirm
that these notifications acted as positive reinforcement
that they were performing the task they set out to accomplish.

We used the Year 1 cluster catalog results and its small
number of accompanying synthetic cluster tests to create
a realistic variety of clusters for insertion into AP search
images. To begin, we chose age, mass, metallicity, attenuation,
and effective radius values for the synthetic
clusters, drawn from the following distribution for each parameter:


• Ages were drawn from a flat distribution of discrete
log(Age/yr) values between 6.6 and 10.1, spaced
at an increment of 0.05 dex to match the grid of
stellar isochrones from the Padova stellar evolution
models (Marigo et al. 2008; Girardi et al. 2010).


• Masses were drawn randomly from a continuous
flat distribution of log(Mass/M⊙) between 2.0 and
5.0, yielding usable sample sizes across the full
range of masses.


• Solar metallicity (Z = 0.019) was assumed for ages
less than 5 Gyr. For ages greater than 5 Gyr, the
metallicity was selected from a grid of Z to simulate
the presence of metal-poor globular clusters:
0.0001 (0.005 Z⊙), 0.001 (0.05 Z⊙), 0.004 (0.2 Z⊙),
0.008 (0.4 Z⊙), 0.019 (Z⊙).


• Extinctions were drawn from an exponential A_V
distribution ranging from 0.17 mag (foreground
Galactic extinction; Schlafly & Finkbeiner 2011) to
3.0 mag following the expression

P(A_V) \propto e^{-A_V / 1.34}    (1)

This distribution was chosen to match the extinction
distribution derived by Fouesneau et al. (2014)
from their integrated light fitting of the Year 1
cluster catalog.


• Spatial profiles are defined using a King (1962) profile
with a fixed concentration (R_tidal/R_core = 30)
and an effective radius (R_eff; equivalent to half-light
radius) drawn from a distribution of measured
half-light radii presented in Paper I, but with a linear
bias towards larger radii. We include this bias
to boost the number of extended objects and ensure
our ability to characterize the completeness of diffuse
clusters. The resulting R_eff distribution peaks
at 1.5 pc (0.39 arcsec) and extends from 0.5-9.0 pc
(0.13-2.4 arcsec).

Fig. 2.— Map of M31 showing HST imaging footprints, oriented
such that north is up and east is left. Color-coded regions denote
various subsets of data: PHAT images searched during 2012 AP
campaign (white); PHAT images searched during 2013 AP campaign
(red); HST archival images searched during 2013 AP campaign
(yellow). Bold white regions show areas searched during Year
1 effort (Paper I). Image Credit: Robert Gendler.
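
The parameter draws listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to build the synthetic sample: we read Equation 1 as an exponential with a 1.34 mag scale length truncated to 0.17-3.0 mag and sample it by inverting its cumulative distribution, we assume a uniform choice among the metallicity grid values for old clusters, and we omit the effective radius draw because it depends on the measured Paper I distribution. All function and variable names are ours.

```python
import math
import random

# Discrete log(Age/yr) grid: 6.60, 6.65, ..., 10.10 (71 values).
LOG_AGE_GRID = [round(6.6 + 0.05 * i, 2) for i in range(71)]
Z_SOLAR = 0.019
Z_GRID_OLD = [0.0001, 0.001, 0.004, 0.008, 0.019]
AV_MIN, AV_MAX, AV_SCALE = 0.17, 3.0, 1.34


def draw_av(rng):
    """Draw A_V from exp(-A_V / AV_SCALE) truncated to [AV_MIN, AV_MAX]."""
    lo = math.exp(-AV_MIN / AV_SCALE)
    hi = math.exp(-AV_MAX / AV_SCALE)
    u = rng.random()
    # Inverse of the truncated-exponential cumulative distribution.
    return -AV_SCALE * math.log(lo - u * (lo - hi))


def draw_cluster(rng):
    """Draw one set of synthetic cluster parameters."""
    log_age = rng.choice(LOG_AGE_GRID)      # flat over the discrete grid
    log_mass = rng.uniform(2.0, 5.0)        # flat in log(Mass/Msun)
    if 10.0 ** log_age < 5e9:
        metallicity = Z_SOLAR
    else:
        # Assumption: each grid value is equally likely.
        metallicity = rng.choice(Z_GRID_OLD)
    return {"log_age": log_age, "log_mass": log_mass,
            "Z": metallicity, "A_V": draw_av(rng)}


rng = random.Random(42)
sample = [draw_cluster(rng) for _ in range(1000)]
```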


After drawing cluster parameter combinations, we populated
individual cluster star lists using the Padova models,
assuming a Kroupa (2001) IMF. We computed total
luminosities for each cluster and selected a subset
for insertion into search images that straddle the detection
limit, as computed for the Year 1 catalog. In
Paper I, we found that the sample was 100% complete
for clusters brighter than m_F475W = 18.5 and 0% complete
for clusters fainter than m_F475W = 23.5. Furthermore,
when we take cluster age into account we
can narrow the range of acceptable m_F475W values even
more: for 6.6 < log(Age/yr) < 8.0 we adopted 18.5 <
m_F475W < 22.0; for 8.0 < log(Age/yr) < 9.0 we adopted
19.5 < m_F475W < 22.5; and for 9.0 < log(Age/yr) <
10.0 we adopted 20.0 < m_F475W < 23.0. These ranges
allow us to efficiently map the functional form of completeness
as a function of F475W magnitude at all ages.
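
The age-dependent magnitude windows above amount to a simple lookup. The sketch below encodes them as a table; the treatment of values that fall exactly on an age boundary, and of ages at or above log(Age/yr) = 10.0, is not specified in the text, so the half-open age bins used here are an assumption.

```python
# (log_age_min, log_age_max, bright_limit, faint_limit) in F475W magnitudes.
MAG_WINDOWS = [
    (6.6, 8.0, 18.5, 22.0),
    (8.0, 9.0, 19.5, 22.5),
    (9.0, 10.0, 20.0, 23.0),
]


def in_insertion_window(log_age, m_f475w):
    """True if a synthetic cluster falls in the adopted magnitude window.

    Age bins are treated as [min, max); ages outside all bins return False.
    Both choices are assumptions made for this illustration.
    """
    for age_min, age_max, bright, faint in MAG_WINDOWS:
        if age_min <= log_age < age_max:
            return bright < m_f475w < faint
    return False
```

For example, a cluster with log(Age/yr) = 8.5 and m_F475W = 22.3 passes the cut, while the same magnitude at log(Age/yr) = 7.0 does not.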

Once we were satisfied with the sample, we inserted
synthetic clusters into F475W and F814W images using
the DOLPHOT photometry package, an updated version
of HSTphot (Dolphin 2000) that is used by the
PHAT collaboration for point-spread function photometry.
These synthetic clusters were added into search
images, one cluster per subimage, positioned pseudo-randomly
within the image but always >120 pixels from
the image edge. We spatially distributed the synthetic
clusters across the PHAT survey footprint, covering a
wide range of galactic environments to ensure our ability
to evaluate completeness throughout M31. We selected
fields that sample the survey’s image variety,
as defined by per-image red giant branch (RGB) star
count. We inserted synthetic clusters into fields with
10^2 < N(RGB) < 10^3, and inserted proportionally fewer
synthetics into fields with N(RGB) < 400 to achieve a
uniform number of synthetic clusters per N(RGB) bin
within this range. This selection results in the placement
of synthetic clusters into regions where a majority
of cluster identifications are made.

2.3. Data Collection & Classification Statistics 

We obtained AP data during two rounds of collection;
the first ran from 5-21 December 2012 and included 72%
of the PHAT images. The remaining PHAT images and
archival images were searched from 22-30 October
2013. Defining a classification as a volunteer’s submitted
response to a single image (containing zero to many individual
markings), AP volunteers performed a total of
1.82 million image classifications. This corresponds to an
average rate of about 70,000 classifications per day; our
peak classification rate was over 80,000 classifications per
hour.
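
The quoted average rate can be checked from the campaign dates. The sketch below assumes that both the first and last day of each campaign are counted, which is our reading of the date ranges rather than a statement from the project logs.

```python
from datetime import date

# Campaign date ranges as quoted in the text (assumed inclusive).
CAMPAIGNS = [
    (date(2012, 12, 5), date(2012, 12, 21)),
    (date(2013, 10, 22), date(2013, 10, 30)),
]
TOTAL_CLASSIFICATIONS = 1_820_000


def campaign_days(campaigns):
    """Total number of calendar days covered, counting both endpoints."""
    return sum((end - start).days + 1 for start, end in campaigns)


n_days = campaign_days(CAMPAIGNS)               # 17 + 9 days
mean_rate = TOTAL_CLASSIFICATIONS / n_days      # classifications per day
```

Under this assumption the two campaigns span 26 days, giving a mean of 70,000 classifications per day.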

A total of 29,262 unique users participated in the AP;
9,663 of these participants logged in using a Zooniverse
account. While the median number of classified images
among all users was only 3 images (27 when only considering
registered participants), 90.5% of our image classifications
were performed by volunteers who examined at
least 50 images. The distribution of work completed by
the AP volunteer community is shown in Figure 3. The
combined effort of Andromeda Project volunteers totals
approximately 24 months of constant human attention.

Each image was classified a minimum of 80 times, but 
the distribution of classifications per image extends up to 
108 with a median of 88. The classification counts vary 
slightly between the two rounds of data collection: the 
median for the 2012 campaign is 86, while the median 
for the 2013 campaign is 101. In all, participants made 
>2 million individual cluster and galaxy identifications. 

3. CATALOG CONSTRUCTION 

The primary goal of this work is to construct a catalog
of clusters from the identifications provided by the
project’s participants. In this section we describe the
process of converting clicks to scientifically valuable data
products. We evaluate the reliability of the crowd-sourced
results and choose appropriate catalog thresholds
by comparing to the PHAT Year 1 catalog (Paper
I), an expert-derived “gold standard” reference.

The first step of catalog construction is to synthesize a
merged list of identifications. We describe the details of
our catalog creation algorithm and show examples of its
application in Appendix A. To briefly summarize: we
spatially merge object identifications on an image-by-image
basis, then merge these intermediate results into a
survey-wide catalog. The resulting raw catalog includes
~54,000 candidate clusters and galaxies, although a vast
majority of these are low significance detections as we
discuss below. Synthetic cluster tests are analyzed using
outputs from the per-image catalogs. Also, artifact identifications
are processed separately from the cluster and
galaxy identifications and will not be discussed as part
of this work.

7 RGB stars are defined as sources with F110W - F160W > 0.5
and F160W < 21.0; see Section [...].

Fig. 3.— Classification statistics for AP. Participants are sorted
as a function of decreasing contribution to the project and plotted
logarithmically on the x-axis. Top: Cumulative fraction of 1.82
million classifications submitted by the top N volunteers. Bottom:
The number of classifications submitted individually by the Nth
volunteer. The red dashed lines highlight that half of the classification
work (cumulative fraction = 0.5) was performed by the
top 543 participants, who each classified >678 images. The blue
dash-dotted lines highlight that 90.5% of the total classifications
were submitted by the 4,671 participants who each classified >50
images.

After assembling a set of candidate objects, we use 
three metrics to identify cluster candidates and separate 
them from galaxies: 

• f_cluster - the fraction of volunteers who viewed the
search image and identified the object as a cluster.

• f_galaxy - the fraction of classifications for an object
that identified it as a galaxy.

• f_clst+gal - the fraction of volunteers who viewed the
search image and identified the object as either a
cluster or a galaxy.

These quantities are related by:

f_{cluster} = f_{clst+gal} \times (1 - f_{galaxy})    (2)
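
To make the relation between the three metrics concrete, the sketch below computes them from raw vote counts for a single candidate. It assumes each volunteer who viewed the image marked the object at most once, as either a cluster or a galaxy; the function name and the example counts are hypothetical.

```python
def vote_fractions(n_views, n_cluster, n_galaxy):
    """Return (f_cluster, f_galaxy, f_clst_gal) for one candidate object.

    n_views   : number of volunteers who viewed the search image
    n_cluster : number who marked the object as a cluster
    n_galaxy  : number who marked the object as a galaxy
    """
    n_marked = n_cluster + n_galaxy
    f_clst_gal = n_marked / n_views
    f_galaxy = n_galaxy / n_marked if n_marked else 0.0
    f_cluster = n_cluster / n_views
    return f_cluster, f_galaxy, f_clst_gal


# Hypothetical example: 80 viewers, 36 cluster marks, 4 galaxy marks.
f_cluster, f_galaxy, f_clst_gal = vote_fractions(80, 36, 4)
```

In this example f_clst+gal = 0.5 and f_galaxy = 0.1, so f_cluster = 0.5 x (1 - 0.1) = 0.45, in agreement with the direct count 36/80.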


The f_cluster scores provide relative rankings for AP
cluster candidates. The top panel of Figure 4 shows the
overall distribution of f_cluster scores for all AP identifications.
This plot shows a large number of low significance
detections with respect to higher significance detections.
The distribution of f_clst+gal values is nearly identical to
the f_cluster distribution.

Fig. 4.— Top: Histogram of f_cluster values for the full catalog
of AP identifications. Bottom: Histograms of f_cluster values for
cross-matched high-quality (blue solid), possible (black dotted),
and rejected (red dashed) Year 1 cluster candidates.

We begin our comparison between AP and Year 1 results
by cross-matching the two catalogs. The bottom
panel of Figure 4 compares the distribution of AP f_cluster
scores for three categories of Year 1 cluster classifications.
We confirm the expectation that increasing
AP f_cluster scores correlate with a greater likelihood that
candidates are clusters.

The distribution of f_galaxy values is presented in Figure
5. The top panel shows a clear bimodality in f_galaxy values,
signaling a clear cluster-versus-galaxy classification
preference for a majority of candidate objects. The bottom
panel confirms the accuracy of these classification
preferences; the expert-derived cluster and galaxy classifications
from the Year 1 catalog map to low and high
f_galaxy scores, respectively. We also observe that f_galaxy
= 0.3 defines a division between clusters and galaxies
that leads to a minimal number of misclassifications.

It is interesting to note that there is an apparent bias
at intermediate f_galaxy values (0.3 < f_galaxy < 0.5), such
that a majority vote of AP participants would not classify
these objects accurately, according to expert-derived
labels. We hypothesize that this bias may be caused
by the default cluster setting for the site’s marking tool,
leading to the tendency to mark candidates, particularly
questionable ones, as clusters. Whatever the cause may
be, only a small number of objects in this range of f_galaxy
could plausibly be considered for inclusion in the AP catalog
as a cluster instead of as a galaxy: there are 125 (13)
objects with f_cluster > 0.2 (0.5) in the full catalog of AP
identifications that fall within 0.3 < f_galaxy < 0.5. Nevertheless,
we adopt an f_galaxy-based selection criterion
to account for this bias and incorporate as much information
as possible during classification. We use the observed
f_galaxy = 0.3 threshold throughout the remainder
of the paper to differentiate cluster and galaxy candidates.

Fig. 5.— Top: Histogram of f_galaxy values, where the solid
(dotted) lines represent the 13,801 (4,449) AP identifications with
f_clst+gal > 0.1 (0.5). The bimodality of this distribution shows
the tendency for AP classifiers to strongly differentiate between
clusters and galaxies. Bottom: Histogram of f_galaxy values for
AP identifications cross-matched to the Year 1 galaxy (red dashed
line) and high-quality cluster (blue solid line) catalogs. An f_galaxy
threshold of 0.3 divides clusters and galaxies with minimal classification
errors.

To select a catalog of likely clusters from the set of AP
identifications, we use selection criteria based on the cluster
candidate’s f_cluster and f_galaxy values. While we have clearly
defined an f_galaxy-based selection criterion, we now need to define an
f_cluster threshold that maximizes the number of clusters identified
while minimizing the number of non-cluster contaminants. As the bottom
panel of Figure 4 shows, these are directly competing goals; decreasing
the f_cluster threshold to include a greater number of high-quality
clusters necessarily introduces additional contaminants as well.

To evaluate how our choice of f_cluster threshold affects the resulting
cluster catalog, we calculate completeness and contamination fractions
based on a comparison between the AP and Year 1 catalogs within their
shared search footprint in the disk of M31. We exclude the bulge region
(Brick 1) from our comparison as its classification results differ
sufficiently from the rest of the survey (see Section 5.1 for further
discussion). We define completeness as the fraction of high-quality
Year 1 clusters accepted by the AP selection criteria. Contamination is
quantified as the fraction of accepted AP clusters that were previously
classified as non-clusters or galaxies by the Year 1 catalog, or are
new AP-only objects not identified or classified during the Year 1
search.
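
The two definitions above can be illustrated with a minimal Python
sketch. The record layout (keys "f_cluster", "f_galaxy", "year1") and
the toy values are assumptions made for illustration only; they are not
the format of the published tables.

```python
def completeness_contamination(candidates, f_cluster_min, f_galaxy_max=0.3):
    """Return (completeness, contamination) for one f_cluster threshold.

    Each candidate is a dict with hypothetical keys:
      "f_cluster", "f_galaxy": AP scores in [0, 1]
      "year1": "cluster" for a high-quality Year 1 cluster,
               "other" for a Year 1 non-cluster or galaxy,
               None for an AP-only object.
    """
    accepted = [c for c in candidates
                if c["f_cluster"] > f_cluster_min
                and c["f_galaxy"] < f_galaxy_max]
    n_year1 = sum(c["year1"] == "cluster" for c in candidates)
    n_recovered = sum(c["year1"] == "cluster" for c in accepted)
    # Every accepted object that is not a high-quality Year 1 cluster is
    # counted as a contaminant (Year 1 non-clusters, galaxies, AP-only).
    n_contaminants = len(accepted) - n_recovered
    completeness = n_recovered / n_year1 if n_year1 else 0.0
    contamination = n_contaminants / len(accepted) if accepted else 0.0
    return completeness, contamination


# Toy data, invented purely for illustration.
toy = [
    {"f_cluster": 0.90, "f_galaxy": 0.00, "year1": "cluster"},
    {"f_cluster": 0.70, "f_galaxy": 0.10, "year1": "cluster"},
    {"f_cluster": 0.40, "f_galaxy": 0.10, "year1": "cluster"},
    {"f_cluster": 0.80, "f_galaxy": 0.05, "year1": None},
    {"f_cluster": 0.60, "f_galaxy": 0.60, "year1": "other"},
    {"f_cluster": 0.20, "f_galaxy": 0.00, "year1": "other"},
]
strict = completeness_contamination(toy, 0.5)
loose = completeness_contamination(toy, 0.3)
```

In this toy case, lowering the threshold from 0.5 to 0.3 recovers one
more Year 1 cluster, so completeness rises while the contaminant count
stays fixed.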

We note that these definitions of completeness and contamination make
an imperfect assumption that the Year 1 search is flawless, in which no
worthy clusters escaped identification and every high-quality cluster
tabulated deserves that distinction. While this expert-derived catalog
serves as a useful standard against which we can compare, it is
inevitable that the completeness and contamination fractions we
calculate with respect to the Year 1 catalog are approximate: 100%
completeness will not be attained, and we expect a modest, non-zero
contamination fraction. To evaluate previously unidentified objects, we
could perform an expert review to individually assess these possible
contaminants; however, this strategy cannot remove the element of
researcher subjectivity. Instead, we adopt an explicitly conservative
stance that affects the absolute values of the contamination fractions
we derive, but which does not impact the analysis choices we make due
to the relative nature of most of these decisions.

We calculate a completeness versus contamination curve with respect to
the expert-derived Year 1 catalog, akin to a receiver operating
characteristic (ROC) curve. By continuously lowering the f_cluster
threshold for the definition of AP clusters, we increase the
completeness of Year 1 objects identified (bottom panel of Fig. 6).
However, the decreasing f_cluster threshold also increases the
contamination, defined as the fraction of the cluster catalog objects
that are either Year 1 non-clusters or new AP-only clusters (top panel
of Fig. 6). In addition to the initial uniformly-weighted set of object
identifications, we construct completeness versus contamination curves
assuming different user weighting schemes, as discussed in Section 3.2.
We compare the result from uniformly-weighted inputs (red) to the range
of results obtained from a grid of weighting systems (gray), including
the curve derived for our optimal weighting scheme (black).

Fig. 6. — Top: Completeness versus contamination curves that result
from uniform user weighting (red) and the optimal user weighting system
(black). The gray shaded region denotes the parameter space covered by
the sum of all curves derived for the grid of weighting systems we
tested (see Section 3.2). Bottom: The f_cluster thresholds used by the
uniform and adopted weighting systems as a function of Year 1
completeness. The vertical dashed lines in both panels denote the
catalog limits adopted for each system based on the d_optimal metric.

To choose a catalog cutoff, we seek a metric that identifies the
f_cluster cutoff value for which the resulting catalog achieves a
balance between completeness and contamination. We choose to work
directly with the completeness versus contamination curve and define
d_optimal, the distance from each point along the curve to the optimal
corner of the plot (completeness and contamination fractions are 1.0
and 0.0, respectively). We note that our choice of metric, which values
the minimization of false positives and false negatives equally, is
somewhat arbitrary; given a specific use-case, one might prefer a
metric that optimizes for a greater number of classifications at the
expense of additional contamination. Our choice to weight completeness
and contamination equally is grounded in the goal of creating a
general-purpose catalog. Also, when we consider the specific shape of
the completeness versus contamination curves we are working with, we
find that this metric tends to select the approximate point of
diminishing return, the limit beyond which relaxing the catalog
threshold tends to add more contaminants than additional good objects.
On the completeness versus contamination plot this limit corresponds to
the point at which the curve is tangent to a line with a slope of
unity. In addition, it is comforting that our choice of metric tends to
approximately conserve the number of clusters within the Year 1
footprint, yielding a similar number of clusters as found in Paper I.
Together, the similarity of these limits gives us confidence that our
specific choice of cutoff is appropriate.
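
As a hedged illustration of the d_optimal metric, the Python sketch
below assumes that "distance" means the Euclidean distance in the
(contamination, completeness) plane; the text does not state the norm
explicitly. The function names and the toy curve are hypothetical. The
two (completeness, contamination) pairs used for comparison are the
uniform-weighting and optimal-weighting values quoted in this section.

```python
import math


def d_optimal(completeness, contamination):
    """Distance to the ideal corner (completeness = 1, contamination = 0).

    Assumes a Euclidean distance, which weights false negatives and
    false positives equally.
    """
    return math.hypot(1.0 - completeness, contamination)


def best_cutoff(curve):
    """Pick the (threshold, completeness, contamination) row with the
    smallest d_optimal from an iterable of such tuples."""
    return min(curve, key=lambda row: d_optimal(row[1], row[2]))


# Toy curve, invented for illustration: (f_cluster cutoff, compl., contam.)
toy_curve = [
    (0.8, 0.70, 0.05),
    (0.6, 0.85, 0.10),
    (0.4, 0.93, 0.25),
]
chosen = best_cutoff(toy_curve)

# Values quoted in the text for the two weighting schemes.
d_uniform = d_optimal(0.853, 0.105)
d_weighted = d_optimal(0.881, 0.098)
```

Under this assumption the optimally weighted catalog lies closer to the
ideal corner than the uniformly weighted one, consistent with the
improvement described in Section 3.2.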

We use the d_optimal metric to identify an optimal completeness and
contamination combination of 85.3% and 10.5%, respectively, for the
case of uniform user weighting; the corresponding f_cluster cutoff is
plotted in Figure 6 and is tabulated along with other corresponding
information in Table 1. We improve sample completeness and
contamination fractions using a user weighting system, as we discuss in
Section 3.2.

We select a catalog of likely background galaxies using a combination
of f_clst+gal and f_galaxy selection criteria in a process similar to
the one described here for the clusters. We document that analysis and
its accompanying details in Appendix B.




Fig. 7. — Left: Comparison of f_cluster for clusters derived from the
2012 campaign (R1) data versus those from the 2013 campaign (R2). Black
points reflect measurements made in normal images, red points reflect
measurements made in synthetic images. We plot 1, 2, and 3σ contours
showing the scatter predicted by our noise model. Right: Histogram of
f_cluster differences scaled by the expected dispersion. A Gaussian
function with σ = 1 and a peak value of 350 is overlaid for reference.
The dispersion of the f_cluster differences between the two rounds
matches the statistical expectation of the noise model.

3.1. f_cluster Uncertainties & Robustness

To demonstrate the robustness of our f_cluster metric, quantify its
associated uncertainties, and establish its consistency across two
separate rounds of data collection, we carried out a repeatability
experiment during the 2013 campaign. We selected 741 images (397
normal, 344 synthetic) searched during the 2012 campaign (Round 1; R1)
that included highly-ranked cluster candidates and repeated data
collection for these images during the 2013 campaign (Round 2; R2). We
match the catalogs that emerge from each run and compare f_cluster
scores for 1,241 objects whose R1 and R2 scores average to >0.35 (891
from normal data, 350 from synthetic data) to test the repeatability of
f_cluster scores for likely clusters. We present the distribution of
f_cluster differences between the two rounds in Figure 7. We model the
Δf_cluster (R1 - R2) scatter using an expression for the combined
variance of two drawing experiments governed by the binomial
distribution:

    \sigma(f_{\rm cluster}) = \sqrt{\frac{2\,p\,(1 - p)}{N}},    (3)

where N = 88, representing the median number of image views, and p is
the averaged R1 and R2 f_cluster score for each object, representing
our best estimate of an object’s “true” f_cluster value. We plot 1, 2,
and 3σ contours as predicted by our noise model, which accurately
captures the scatter shown in the data.
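
The noise model can be evaluated numerically with the short Python
sketch below. It assumes that Equation 3 has the form
sigma = sqrt(2 p (1 - p) / N), i.e. the dispersion of the difference
between two independent binomial fractions that share the same p and N;
the function names are hypothetical.

```python
import math


def sigma_f_cluster(p, n_views=88):
    """Expected dispersion of f_cluster(R1) - f_cluster(R2).

    Assumed form: two independent binomial fractions, each with
    variance p (1 - p) / N, so the difference has twice that variance.
    """
    return math.sqrt(2.0 * p * (1.0 - p) / n_views)


def scaled_difference(f_r1, f_r2, n_views=88):
    """Difference between rounds in units of the expected dispersion."""
    p = 0.5 * (f_r1 + f_r2)  # best estimate of the "true" f_cluster
    sigma = sigma_f_cluster(p, n_views)
    if sigma == 0.0:
        # p is exactly 0 or 1, so both rounds agree and the difference is 0.
        return 0.0
    return (f_r1 - f_r2) / sigma


sigma_mid = sigma_f_cluster(0.5)          # largest scatter, at p = 0.5
example = scaled_difference(0.55, 0.45)   # a 0.10 difference between rounds
```

With N = 88 the expected scatter peaks at about 0.075 for p = 0.5, so a
difference of 0.10 between rounds corresponds to roughly 1.3σ under
this model.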

These results demonstrate that image classifications collected during
the 2012 and 2013 campaigns are functionally equivalent, allowing us to
easily combine data from the two rounds. This experiment also shows
that our procedure of combining >80 image classifications from the pool
of AP participants provides consistent f_cluster results with minimal
systematic biases.

3.2. User Weighting 

Up to this point, we have assumed that the abilities of all classifiers
are equal on average. In this section we investigate whether weighting
individual volunteers based on the quality of their classifications can
improve the cluster sample. User weighting has been applied in several
other citizen science projects (Lintott et al. 2008; Schwamb et al.
2012) and seems naturally applicable to our AP data. In line with these
previous implementations, we calculate weightings based on the level of
agreement between a participant’s classifications and the consensus
opinion of all the volunteers. Individuals who agree with the consensus
opinion are up-weighted, while those who disagree with the consensus
opinion are down-weighted. Expanding beyond previous implementations,
we vary the strength and form of weighting, evaluating the success of
each iteration by comparing completeness versus contamination curves
(derived through comparison to the Year 1 sample) to the unweighted
case presented earlier in Section 3.

We could have chosen another way to assign weights, such as assessing a
volunteer’s performance with respect to expert-derived Year 1 results,
or basing weights on a participant’s recovery rate of synthetic
clusters. Both of these alternative methods share a downside: the
resulting weights are based on only a fraction of the available
classification data. Decreasing the volume of classifications
considered by the weighting system leads to an increasing number of
participants with little or no assessment information, and noisier
ability estimates for every volunteer. Additionally, weighting systems
tend to produce catalogs that resemble the data used for training and
calibration. We were concerned that defining weights based on data that
did not sample the variety and parameter ranges included in the full
cluster sample might result in unwanted biases. Particularly in the
case of the synthetic cluster data, which was specifically designed to
characterize cluster recovery near the detection limit, these biases
could be significant. To exploit the unique benefits provided by our
crowd-sourced methodology, we utilize the unfiltered opinion of AP
volunteers.

Figure 8 shows two quantities that we use to characterize the
performance of our volunteers: the fraction of consensus clusters a
volunteer identified, f_consensus, and the mean f_cluster of all
cluster identifications made by a volunteer, <f_cluster>. We define
consensus clusters as objects that show a high degree of agreement
among AP participants, where f_cluster > 0.6 and f_galaxy < 0.2; these
limits provide a sample with a sufficient number of clusters to enable
weighting of individual participants while ensuring that weights are
not based on questionable candidates (see Figure^]).
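
The two volunteer metrics can be sketched in Python as follows. The
data layout (sets of object identifiers and score dictionaries) and the
toy values are assumptions for illustration; only the consensus limits
f_cluster > 0.6 and f_galaxy < 0.2 come from the text.

```python
def volunteer_metrics(seen, marked, f_cluster, f_galaxy):
    """Return (f_consensus, mean f_cluster) for one volunteer.

    seen      : set of object ids present in images the volunteer viewed
    marked    : set of object ids the volunteer marked as clusters
    f_cluster : dict mapping object id to its f_cluster score
    f_galaxy  : dict mapping object id to its f_galaxy score
    """
    # Consensus clusters: strong agreement among all AP participants.
    consensus = {obj for obj in seen
                 if f_cluster[obj] > 0.6 and f_galaxy[obj] < 0.2}
    f_consensus = (len(consensus & marked) / len(consensus)
                   if consensus else None)
    mean_f_cluster = (sum(f_cluster[obj] for obj in marked) / len(marked)
                      if marked else None)
    return f_consensus, mean_f_cluster


# Toy scores, invented for illustration.
scores_cluster = {"a": 0.9, "b": 0.7, "c": 0.3, "d": 0.65}
scores_galaxy = {"a": 0.0, "b": 0.1, "c": 0.0, "d": 0.4}
viewed = {"a", "b", "c", "d"}

conservative = volunteer_metrics(viewed, {"a"}, scores_cluster, scores_galaxy)
liberal = volunteer_metrics(viewed, {"a", "b", "c"},
                            scores_cluster, scores_galaxy)
```

Here the conservative volunteer marks only the most obvious cluster
(high mean f_cluster, low f_consensus), while the liberal volunteer
marks every consensus cluster plus a low-ranked object (high
f_consensus, lower mean f_cluster).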

Examination of Figure 8 reveals that there is wide variation of
classification behavior among AP volunteers. Individuals that lie in
the upper left part of the plot are conservative classifiers;
everything they clicked was an obvious cluster, leaving many consensus
clusters unmarked. On the other hand, participants in the lower right
are liberal classifiers; they identified a large fraction of the
consensus cluster sample, but also identified many other low-ranked
objects that are not likely clusters. Volunteers with scores that lie
in the upper right portion of Figure 8 are desirable classifiers,
obtaining high completeness but with little sacrifice to the overall
quality of their identifications. We note that because of the intrinsic
f_cluster distribution of the good clusters, volunteers who identify a
large fraction of the good clusters cannot have an average f_cluster of
1.0; we compute the upper limit to average f_cluster based on the
f_cluster distribution of good clusters and plot this envelope as a
dashed line in Figure 8.

Fig. 8. — Performance metrics for 4,671 volunteers who classified >50
AP search images. The x-axis represents f_consensus, the fraction of
consensus clusters (f_cluster > 0.6 and f_galaxy < 0.2) identified by
each participant out of the total number of consensus clusters they
saw. The y-axis represents <f_cluster>, the average f_cluster value of
all clusters identified by that volunteer. The dotted line represents
an approximate ceiling to <f_cluster> values as a function of
f_consensus, calculated by considering the intrinsic f_cluster
distribution of the consensus cluster sample. Conservative classifiers,
those that identify only the best cluster candidates, lie in the upper
left portion of the plot. Liberal classifiers, those that identify most
good clusters but also identify many low-ranked candidates, lie in the
bottom right portion of the plot.

To make best use of classifications from both conservative and liberal
cluster identifiers, we apply separate weightings to a volunteer’s
detections and non-detections. Specifically, we weight a participant’s
detections based on the average f_cluster of clusters they identified,
<f_cluster>, while non-detections are weighted based on f_consensus,
the fraction of consensus clusters the volunteer identified. As an
example: in the case where a liberal classifier in the lower right
corner of Fig. 8 did not click on a cluster, their non-detections are
up-weighted because they are known to identify most good clusters. The
detections from the same classifier, however, are down-weighted because
this individual identifies many low-quality cluster candidates in
addition to the high-quality ones.

We adopt a threshold number of subimage classifications above which we
can assume we have adequately characterized a participant’s
classification behavior. Volunteers with fewer than 50 subimage
classifications are distributed with greater randomness across the
<f_cluster> versus f_consensus plane, suggesting large uncertainties in
the values of their performance metrics; we adopt 50 classifications as
the threshold. Individuals who fall below this classification threshold
are assigned mean detection and non-detection weights. Even when this
limit is imposed, ~90% of all image classifications are weighted using
individually determined user weights. We note that anonymous accounts
from unregistered users are treated in the same way as those from
registered users for weighting purposes. Most of these users are
assigned mean detection and non-detection weights because they submit a
small number of classifications (median number of classifications is
2); ~5% of unregistered users surpass the minimum subimage threshold
for individual weight assignment.

Fig. 9. — A comparison between optimally-weighted f_cluster,W scores
and uniformly-weighted f_cluster values, showing the impact of user
weighting on individual object scores. The red lines show the median
trend and one standard deviation around the median. Horizontal dashed
lines denote the f_cluster,W cutoffs corresponding to each of the
printed Year 1 completeness fractions, while the vertical dashed line
denotes the approximate f_cluster value that corresponds to the optimal
f_cluster,W cutoff.

Next we determine how to translate performance metric scores into
relative user weights. We adopt a general form for the transformation
based on the generalized logistic function. Favorable aspects of this
functional form include its tunable scaling and that it allows for the
saturation of weights at high and low input metric scores. Our
“constrained” logistic function is defined as:

    W(x) = B \times \left( A + \frac{1}{1 + e^{-m_{\rm logistic}\,(x - b_{\rm logistic})}} \right),    (4)

where x represents the input performance metric (either <f_cluster> or
f_consensus) while m_logistic and b_logistic are the slope (growth
rate) and the offset (position of maximum growth) of the logistic
curve, respectively. The variables A and B are normalizations set such
that W varies from 0 to 1 over the interval x = [0,1], providing the
constrained aspect of this function. Once a set of logistic function
parameters has been chosen for the detection and non-detection
weighting functions, we apply user weightings to individual cluster
votes on an image-by-image basis and recalculate weighted f_cluster
values, f_cluster,W.
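
A minimal Python sketch of the constrained logistic weight is given
below. It assumes the functional form W(x) = B (A + 1 / (1 + exp(-m (x
- b)))) and solves A and B from the stated constraints W(0) = 0 and
W(1) = 1. The second function, weighted_f_cluster, is purely
hypothetical: the text does not spell out how weighted votes are
combined, so a weighted vote fraction is assumed here for illustration.
The parameter values m = 10 and b = 0.5 are arbitrary, not the adopted
ones.

```python
import math


def constrained_logistic(x, m, b):
    """Logistic weight rescaled so that W(0) = 0 and W(1) = 1.

    m is the slope (growth rate) and b the offset (position of maximum
    growth); m must be non-zero for the normalization to exist.
    """
    def logistic(t):
        return 1.0 / (1.0 + math.exp(-m * (t - b)))

    a_norm = -logistic(0.0)                         # forces W(0) = 0
    b_norm = 1.0 / (logistic(1.0) - logistic(0.0))  # forces W(1) = 1
    return b_norm * (a_norm + logistic(x))


def weighted_f_cluster(votes):
    """Hypothetical weighted vote fraction for one object.

    votes: list of (clicked, w_detection, w_non_detection) per viewer.
    A click contributes the viewer's detection weight; a non-click
    contributes the viewer's non-detection weight to the denominator.
    """
    clicked = sum(w_det for did_click, w_det, _ in votes if did_click)
    skipped = sum(w_non for did_click, _, w_non in votes if not did_click)
    total = clicked + skipped
    return clicked / total if total else 0.0


w_low = constrained_logistic(0.0, 10.0, 0.5)
w_mid = constrained_logistic(0.5, 10.0, 0.5)
w_high = constrained_logistic(1.0, 10.0, 0.5)

toy_votes = [(True, 1.0, 0.5), (True, 0.5, 1.0),
             (False, 0.2, 1.0), (False, 0.9, 0.5)]
toy_score = weighted_f_cluster(toy_votes)
```

With all weights equal, weighted_f_cluster reduces to the ordinary
fraction of viewers who clicked, which matches the unweighted
definition of f_cluster.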

We vary the input logistic function parameters to search for a set of
values that produce the best possible weighted catalog. We construct a
grid of weighting systems by varying the values of the four free
parameters: the slope and offset values for both the detection and
non-detection weights. For each set of parameters, we calculate a
completeness versus contamination curve and its corresponding minimum
distance to the corner of optimal completeness and contamination,
d_optimal. We gradually extended the weighting grid to include an
increasing range of logistic parameter values until we identified a
minimum d_optimal value that was unsurpassed. We defined the set of
parameters that yielded this minimum d_optimal value as our optimal AP
weighting system.
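
The grid search over the four free parameters can be sketched as a
simple exhaustive loop. In the Python example below, the callable
`evaluate` stands in for the full pipeline (apply weights, rebuild the
completeness versus contamination curve, return its minimum
d_optimal); it is a hypothetical placeholder, and the toy objective
and grid values are invented for illustration.

```python
import itertools


def search_weighting_grid(evaluate, slopes, offsets):
    """Return (best d_optimal, best parameters) over a 4-parameter grid.

    Parameters are ordered as (detection slope, detection offset,
    non-detection slope, non-detection offset).
    """
    best_d, best_params = None, None
    for params in itertools.product(slopes, offsets, slopes, offsets):
        d = evaluate(*params)
        if best_d is None or d < best_d:
            best_d, best_params = d, params
    return best_d, best_params


def toy_evaluate(m_det, b_det, m_non, b_non):
    """Toy objective with a known minimum; NOT the real pipeline."""
    return (0.15
            + 0.001 * (m_det - 10) ** 2 + (b_det - 0.5) ** 2
            + 0.001 * (m_non - 5) ** 2 + (b_non - 0.25) ** 2)


best_d, best_params = search_weighting_grid(
    toy_evaluate, slopes=[1, 5, 10, 20], offsets=[0.25, 0.5, 0.75])
```

If the best parameters fall on the edge of the grid, the grid would be
extended and the search repeated, mirroring the gradual extension
described above.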

The range of completeness versus contamination curves is represented by
the gray region in the top panel of Figure 6. We also plot the
individual curve derived for the optimal weighting system and list its
logistic function parameters in Table 1. The optimal weighting system
provides a contamination fraction of 9.8% at a completeness of 88.1%.
When compared to the uniform weighting results, applying user weighting
decreases the number of contaminants by 36% (from f_contamination of
0.152 to 0.098 at completeness of 88.1%), or alternately increases
completeness from 84.6% to 88.1% (at f_contamination of 0.098). While
user weighting does not dramatically change the total number of
cataloged clusters or the Year 1 completeness percentage, we are able
to reduce the number of possible contaminants by a significant amount.

We compare original versus weighted f_cluster values to illustrate the
impact of user weighting on individual clusters. Figure 9 shows that
user weighting tends to increase the separation between high and low
f_cluster objects, providing better differentiation at moderate
f_cluster values that lie near the catalog cutoff. To visualize how the
choice of f_cluster,W cutoff affects the output cluster catalog, we
represent four different threshold values as horizontal lines in the
figure, each labeled according to its corresponding Year 1 completeness
fraction. We also plot a vertical line in Figure 9 representing the
f_cluster cutoff that best approximates the optimal f_cluster,W
threshold.

The user weighting applied here enhanced final AP catalog results by
achieving small but quantifiable improvements through a combination of
decreased contamination and increased completeness. We note that we
were fortunate to obtain a large number of classifications per image
(>80), allowing us to account for variations in participant performance
by averaging over a large number of classifications. Many citizen
science projects cannot afford to collect a similar number of per-image
classifications because they need to distribute effort over a larger
volume of data, or because the project is working on time-sensitive
tasks that cannot wait for additional input to be collected. In these
cases, we expect that user weighting would play an essential role in
obtaining high-quality results.

We utilize the f_cluster,W values as defined by the optimal user
weighting system throughout the rest of the paper.

4. CATALOG COMPLETENESS 

We introduced our set of synthetic cluster tests in Section 2.2; here
we present catalog completeness results derived from those tests,
including how catalog completeness correlates with properties of the
clusters and their surrounding fields.

The traditional method of characterizing the completeness of a cluster
catalog is to identify the 50% completeness limit as a function of
cluster luminosity. The two plots in the left column of Figure 10 show
the behavior of the 50% completeness limit in F475W as a function of
cluster age for the full sample of synthetic clusters. These plots show
that while the synthetic results at log(Age/yr) > 8.0 agree with a
single, age-independent magnitude limit at F475W ~ 21.5, there is an
apparent age dependence at younger ages. This result conflicts with the
standard assumption that luminosity-based completeness limits for
cluster catalogs are independent of age.

Fig. 10.— Completeness results from synthetic cluster analysis. Top
Panels: Detection results for individual synthetic clusters (black =
detected, red = not detected), as well as 50% completeness limits
calculated for each age bin. Bottom Panels: Completeness functions for
each age bin, color-coded to match their corresponding bin in the top
plot. Results as a function of F475W magnitude, mass, and F475W-3
magnitude are presented in the left, center, and right columns,
respectively. F475W-3 magnitudes represent the cluster flux that
remains after subtracting the contribution of the cluster’s three most
luminous members.

To understand why we find brighter, non-constant completeness limits at
ages <100 Myr, we examine our completeness results as a function of
cluster mass, presented in the middle column of Figure 10. Under the
assumption of an age-independent, constant luminosity completeness
limit, we would expect a continuous increase in the 50% mass
completeness with increasing age due to stellar evolution driven fading
of the cluster’s stars. In contrast to these expectations, we find a
near-constant 50% completeness limit for log(Age/yr) < 8.0 of
~500 M⊙. It appears that catalog completeness correlates with cluster
mass rather than luminosity at ages <100 Myr.

To explain the observed completeness behavior, it is important to note
that nearly every synthetic cluster we tested with an age <100 Myr has
a mass <3×10^3 M⊙. The integrated light of young low mass clusters is
dominated by a small number of bright stars. This fact leads to large
stochastic variations in the total integrated light for a sample of
clusters with identical masses (see Fouesneau & Lançon 2010; Beerman et
al. 2012; Popescu et al. 2012). In addition, cluster identification in
HST imaging of M31 relies greatly on the presence of an over-density of
individually resolved stars, such that the number of observable stars
might correlate better with a cluster’s detection probability than its
total luminosity in this low-mass regime. In this case, the correlation
between completeness and mass is explained by a strong correlation
between mass and the number of bright, observable cluster members.

We conclude that there are two regimes for AP cluster catalog
completeness: for ages <100 Myr, cluster detection is limited by the
number of observable member stars; for ages >100 Myr, cluster detection
is governed by the total cluster luminosity. To bridge these regimes,
we devise a single cluster metric that correlates strongly with the 50%
catalog completeness limit, independent of cluster age: F475W-3, the
F475W magnitude remaining after subtracting the flux contribution from
the cluster’s three brightest stars. By excluding the contribution of
the three brightest cluster stars, we significantly reduce the
stochastic variation in cluster luminosity that imprinted an
age-dependence into the completeness results. We experimented with the
number of stars to exclude and found that three provided the best
correction. The plots in the right column in Figure 10 show that our
data are consistent with a single, age-independent 50% completeness
limit at a F475W-3 magnitude of 21.65, where the new metric
successfully unifies the two completeness regimes.
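
The F475W-3 metric amounts to a flux subtraction carried out in
magnitude space. The Python sketch below assumes that the cluster and
stellar magnitudes share a common photometric zero point, so that
relative fluxes can be written as 10^(-0.4 m); the function name and
the example magnitudes are hypothetical.

```python
import math


def mag_minus_brightest(cluster_mag, star_mags, n_exclude=3):
    """Magnitude left after removing the n brightest member stars.

    cluster_mag : integrated cluster magnitude
    star_mags   : magnitudes of individual member stars (same band and
                  zero point as cluster_mag)
    """
    total_flux = 10.0 ** (-0.4 * cluster_mag)
    # Smaller magnitudes are brighter, so sort ascending and slice.
    brightest = sorted(star_mags)[:n_exclude]
    removed_flux = sum(10.0 ** (-0.4 * m) for m in brightest)
    remaining = total_flux - removed_flux
    if remaining <= 0.0:
        raise ValueError("brightest stars account for all of the flux")
    return -2.5 * math.log10(remaining)


# Invented example: a 20.0 mag cluster whose three brightest members
# have magnitudes 21.5, 22.0, and 22.5.
f475w_3 = mag_minus_brightest(20.0, [22.5, 24.0, 21.5, 22.0])
```

In this invented example the three brightest stars carry about half of
the light, so the cluster is roughly 0.77 mag fainter in F475W-3 than
in F475W. This illustrates how strongly a few stars can dominate the
integrated light of a low-mass cluster.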

Using the results derived from the full set of synthetic cluster tests
as a baseline, we can test whether completeness depends on two other
important factors: the spatial profile of the cluster and the
characteristics of the field surrounding the cluster. At a fixed
luminosity, we expect the completeness to worsen for larger, more
extended clusters because the same total luminosity is spread over a
larger area, causing the contrast between cluster and underlying
background to decrease. Likewise, the cluster to background contrast
also decreases as the background surface brightness and stellar density
increase, which also leads to a prediction of brighter cluster
luminosity completeness limits.

Fig. 11.— Deviations from average completeness in F475W-3 magnitude as
a function of R_eff, N(RGB), and N(MS) in the top, middle, and bottom
panels, respectively. The dashed line represents the baseline 50%
completeness level of F475W-3 of 21.65. Seven bins divide the synthetic
cluster sample into equal parts (N ~ 440) as a function of each cluster
variable.

Contrary to the simple expectation, we observe non-monotonic behavior
in the 50% completeness limit as a function of a cluster’s effective
radius (R_eff; equivalent to the half-light radius), as shown in the
top panel of Figure 11. The 50% completeness limit reaches its faintest
value at log(R_eff/arcsec) ~ -0.35; detection limits worsen as clusters
become more extended, and they also worsen as clusters become more
compact. Detection becomes more difficult at small R_eff due to the
inability of an image classifier to distinguish between single sources
and a compact collection of individual stars. This behavior was also
seen by Silva-Villa & Larsen (2011) in their study of extragalactic
clusters. The variation in R_eff can cause F475W-3 50% completeness
limits to deviate by >0.5 mag from the baseline level, translating to a
mass completeness difference of up to 0.15-0.2 dex.

Background stellar density, on the other hand, shows the expected
behavior that higher stellar density makes cluster detection more
difficult. We quantify local stellar density by counting the number of
red giant branch (RGB) and main sequence (MS) stars that lie within the
search images (36.25 x 25 arcsec) that host each synthetic cluster.
These counts are based on the survey-wide six-band GST photometric
catalogs (Williams et al. 2014), where we define RGB stars as sources
with F110W-F160W > 0.5 and F160W < 21.0, and MS stars as sources with
F475W-F814W < 1.0 and F814W < 25.0.
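
The RGB and MS selections are simple color and magnitude cuts, which
can be sketched in Python as below. The tuple layout and the three
example stars are invented for illustration; only the numerical cuts
come from the text.

```python
def classify_star(f475w, f814w, f110w, f160w):
    """Return (is_rgb, is_ms) for one source, using the cuts in the text."""
    is_rgb = (f110w - f160w) > 0.5 and f160w < 21.0
    is_ms = (f475w - f814w) < 1.0 and f814w < 25.0
    return is_rgb, is_ms


def count_stars(stars):
    """Count RGB and MS stars in an iterable of
    (f475w, f814w, f110w, f160w) magnitude tuples."""
    n_rgb = n_ms = 0
    for star in stars:
        is_rgb, is_ms = classify_star(*star)
        n_rgb += is_rgb
        n_ms += is_ms
    return n_rgb, n_ms


# Invented magnitudes: a blue MS star, a red giant, and a faint source
# that passes the MS color cut but fails the F814W magnitude limit.
toy_stars = [
    (24.0, 23.8, 23.5, 23.4),
    (25.5, 22.5, 21.3, 20.4),
    (26.5, 25.8, 25.2, 24.9),
]
n_rgb, n_ms = count_stars(toy_stars)
```

Applied to all sources inside the footprint of a search image, these
counts give the N(RGB) and N(MS) values used as background density
indicators.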


The middle and bottom panels of Figure 11 show 50% catalog completeness
limits as a function of N(RGB) and N(MS). As a function of N(RGB) and
N(MS), the F475W-3 50% completeness limits vary by ~0.5 mag,
translating to a mass completeness difference of up to 0.15-0.2 dex.
This dependency affects the detection of PHAT clusters in the inner
disk and bulge, as well as those within dense star forming regions -
especially those located within the ~10 kpc ring.

To supplement the above description of overall, sample-wide
completeness behavior, we present a table of object-by-object
completeness test results in Appendix C. These results allow catalog
users to calculate completeness functions for specific spatial regions
or over a custom range in parameter space.
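
As one possible way to use object-by-object results, the Python sketch
below bins synthetic clusters by magnitude, computes the detected
fraction per bin, and linearly interpolates the 50% crossing. The
binning and interpolation scheme is an assumption made for
illustration and is not necessarily the procedure used to derive the
limits quoted above; the input values are invented.

```python
def completeness_curve(mags, detected, edges):
    """Return [(bin center, detected fraction)] for non-empty bins.

    mags     : magnitudes of the synthetic clusters
    detected : booleans, True if the cluster was recovered
    edges    : ascending bin edges in magnitude
    """
    curve = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        flags = [d for m, d in zip(mags, detected) if lo <= m < hi]
        if flags:
            curve.append(((lo + hi) / 2.0, sum(flags) / len(flags)))
    return curve


def completeness_limit(curve, level=0.5):
    """Magnitude where completeness first drops below `level`,
    found by linear interpolation between adjacent bins."""
    for (x0, f0), (x1, f1) in zip(curve[:-1], curve[1:]):
        if f0 >= level > f1:
            return x0 + (f0 - level) / (f0 - f1) * (x1 - x0)
    return None  # the curve never crosses the requested level


# Invented detection results for twelve synthetic clusters.
toy_mags = [20.1, 20.4, 20.6, 20.9, 21.1, 21.3,
            21.6, 21.8, 22.2, 22.4, 22.6, 22.9]
toy_detected = [True, True, True, True, True, True,
                True, False, True, False, False, False]
toy_curve = completeness_curve(toy_mags, toy_detected, [20.0, 21.0, 22.0, 23.0])
toy_limit = completeness_limit(toy_curve)
```

Restricting the input lists to a spatial region or a parameter range
before calling these functions would give a region-specific
completeness estimate.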

5. RESULTS 

5.1. AP Cluster Catalog 

We apply the catalog construction techniques and user weighting
methodology presented in Section 3 to define an AP cluster catalog,
adopting final selection criteria of:

    f_cluster,W > 0.6416 AND f_galaxy < 0.3.    (5)

These criteria yield a sample of 2,714 clusters. We add two additional
sets of clusters to these initial selections. First, we add 35 clusters
to the sample that are located in the bulge-dominated region within ~3
kpc of M31’s center, as defined by an ellipse with a center of
(10.684575, +41.268972), semi-major axis of 815 arcsec, semi-minor axis
of 410 arcsec, position angle of 45 degrees, and bounded by the PHAT
footprint. These objects are primarily globular clusters that were
identified and confirmed by previous surveys. They suffer from
systematically low f_cluster,W scores due to their atypical appearance
(compact and smooth with few individually resolved stars), high surface
brightness backgrounds, and suboptimal search image scalings. We
decided that the most straightforward solution to correct for these
missed objects was to include all previously confirmed clusters
(high-quality Year 1 or RBC flag of 1) that lie within the defined
region and evaluate all candidate or possible objects. We confirmed by
eye that each of the previously confirmed objects has an appearance
consistent with that of a cluster, and confirmed two additional
candidate objects. Second, we include 4 additional expert cluster
identifications from the B03 tooth images that were not included in the
AP search due to delayed data availability.

The final AP cluster catalog includes 2,753 objects. Figure 12 shows
the positions of the clusters within the PHAT survey footprint.
Andromeda’s ~10 kpc star-forming ring is a prominent feature visible in
the clusters’ spatial distribution. We assign identifiers in descending
order of their maximum per-image, uniformly-weighted f_clst+gal score.
Positions, aperture sizes, and other relevant catalog metadata are
presented in Table 2.
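
The final selection in Equation 5 and the catalog total can be checked
with a few lines of Python. The function and constant names are
hypothetical; the thresholds and counts are those quoted in this
section.

```python
F_CLUSTER_W_MIN = 0.6416  # weighted f_cluster cutoff from Equation 5
F_GALAXY_MAX = 0.3        # f_galaxy cutoff from Equation 5


def is_ap_cluster(f_cluster_w, f_galaxy):
    """Apply the Equation 5 selection to one AP candidate."""
    return f_cluster_w > F_CLUSTER_W_MIN and f_galaxy < F_GALAXY_MAX


# Catalog total: score-selected clusters, plus the bulge-region
# additions, plus the expert identifications added separately.
n_selected, n_bulge, n_expert = 2714, 35, 4
n_total = n_selected + n_bulge + n_expert
```

The three contributions sum to the 2,753 objects reported for the final
AP cluster catalog.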

All AP candidates with f_clst+gal > 0.1 that are not included in the
cluster catalog (or galaxy catalog; see Appendix B) are listed in an
ancillary table in the Appendix. We include information on these
additional candidates to allow other workers the opportunity to make
different choices concerning catalog selection.

5.2. Comparison to Previous Cluster Catalogs

We cross-match our AP cluster catalog with a selection of previously
published catalogs: the Year 1 catalog, the RBC, Caldwell et al.
(2009), and the HKC. We include


Fig. 12.— Spatial distribution of AP cluster catalog overlaid on the
PHAT survey-wide F475W image. The red ellipse denotes the bulge region
within which the catalog completeness and object recovery vary
significantly from the rest of the survey.

classifications for these conflicting cases. We also find good
agreement between the AP and the Caldwell et al. (2009) catalogs. Only
18 conflicts arise from the Caldwell catalog (8 AP clusters are not
Caldwell clusters, 10 Caldwell clusters are not AP clusters), while 232
cluster classifications are common to both the Caldwell and the AP
catalogs.

Finally, we compare the AP catalog with the HKC catalog compilation.
These clusters represent the low-mass additions to previous
ground-based catalogs provided by early targeted HST observations, and
therefore include many objects that lie at or near completeness limits.
As such, a direct comparison shows 156 previously identified clusters
confirmed by our AP classifications, while 57 are not confirmed. This
73% yield is nearly identical to the 72% yield we found for the Year 1
catalog during a similar comparison exercise. A vast majority of HKC
objects that were not confirmed by the AP catalog are borderline,
marginal candidates where there is a subjective difference in opinion
between the HKC authors and the consensus judgement of AP volunteers;
rejected objects are distributed uniformly in f_cluster,W, such that
half of these rejected objects have f_cluster,W > 0.3.

Overall, the comparison between the AP catalog and 
previous non-PHAT M31 cluster catalogs shows good 
agreement with few conflicting classifications. A total 
of 733 unique, previously cataloged objects (both cluster 
and non-cluster classifications) match to AP candidates; 
468 were previous (confirmed) cluster classifications, of 
which 404 were confirmed by the AP catalog. Within 
the PH AT survey footprint, we have increased the sam¬ 
ple of confirmed clusters by a factor of ~6 (from 468 
to 2753). The HST-based AP catalog provides improve¬ 
ment in terms of catalog completeness and quality, and 
builds upon the firm foundation laid by these previous 
works. Commentary on individual classification differ¬ 
ences can be found Appendix [D] 

5.3. Integrated Photometry 

We perform integrated aperture photometry for each of 
the AP catalog entries. Our photometry procedures are 
described in Paper I; we summarize the main ideas here, 


alternate identifiers for previously classified objects in 
Table [2] and summarize the high degree of consistency 
between the AP catalog and previous results below. 

By design, the AP catalog bears a strong resemblance 
to the Year 1 catalog. When we consider the portion 
of the AP catalog that lies within the Year 1 imaging 
footprint (including B01, differing slightly from the Sec¬ 
tion [3] analysis), we count 688 clusters, which is a 14.5% 
increase over the 601 object Year 1 catalog. The agree¬ 
ment between the two samples is good: the AP catalog 
includes 88.5% (532/601) of the good Year 1 clusters, and 
91% of the AP cluster catalog were previously classified 
as high-quality or possible Year 1 objects. The AP cat¬ 
alog includes 39 Year 1 catalog rejections and 22 objects 
not classified in the Year 1 search. While the majority 
of object-by-object classification differences are caused 
by clusters with /duster,w scores that lie near the cata¬ 
log cutoff, we discuss a number of meaningful s ystem atic 
differences between the two catalogs in Section [672] 

Com parison of the AP catalog to the RBC and the 
Caldwell et al. (20091 catalog provides an opportunity to 
cross-reference witn commonly cited sources, linking our 
present work to a wealth of ancillary information about 
these clusters, including a great deal of follow-up spec¬ 
troscopy. These ground-based catalogs do not reach the 
faint objects accessible to the PHAT imaging, therefore 
the following comparison mostly consists of verifying or 
discarding previously unconfirmed candidates that lie at 
the middle or bright end of the AP sample. 

Cross-matching the AP catalog with the RBC, we find 
that 260 previously confirmed, candidate, or controver¬ 
sial clusters (RBC flag = 1, 2, or 3) match to AP clus¬ 
ters, while 42 AP classifications conflict with those from 
the RBC (40 AP clusters are not RBC clusters, 2 RBC 
clusters are not AP clusters), and 18 additional RBC 
candidate or controversial classifications were rejected. 
PHAT’s high spatial resolution imaging is often used as 
a definitive tool for classifying objects, so we defer to AP 
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but refer the reader to that paper for additional details. 
We use the mean center and median radius of an object’s merged classifications to define the position and radius ($R_{\rm ap}$) of the photometric aperture. The sky background is calculated within an annulus ten times the area of the photometric aperture, extending from 1.2 $R_{\rm ap}$ to ~3.4 $R_{\rm ap}$. Photometric uncertainties are dominated by uncertainties in the sky background determination; this source of uncertainty is often ignored in extragalactic cluster photometry. Identical apertures (constant angular size) are employed across all six PHAT images. Aperture magnitudes for significant detections (S/N > 3 with respect to the variation in the sky background) are listed for each photometric passband in Table 2; 3σ upper limits are provided for non-detections, and blank entries denote incomplete image coverage.
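The aperture geometry and detection rule above can be checked with a few lines of Python. This is only an illustrative sketch, not the photometry code used for the catalog: the function names, the flux units, and the `zeropoint` argument are hypothetical.

```python
import math

def annulus_area_in_aperture_units(inner=1.2, outer=3.4):
    """Sky-annulus area divided by the aperture area; radii are in units of R_ap."""
    return outer**2 - inner**2

def is_significant(flux, sky_sigma, threshold=3.0):
    """True when the aperture flux exceeds `threshold` times the sky-background scatter."""
    return flux / sky_sigma > threshold

def upper_limit_mag(sky_sigma, zeropoint, n_sigma=3.0):
    """Magnitude of an n-sigma upper limit; `zeropoint` is a hypothetical calibration term."""
    return zeropoint - 2.5 * math.log10(n_sigma * sky_sigma)

# An annulus from 1.2 to 3.4 R_ap covers roughly ten aperture areas.
area_ratio = annulus_area_in_aperture_units()
```

With the quoted radii the ratio evaluates to about 10.1, which matches the statement that the annulus is ten times the aperture area.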

We obtain photometric $R_{\rm eff}$ estimates by interpolating radial flux profiles. These values are then used to derive aperture corrections, which help account for cluster light that falls outside of the photometric aperture. We compare synthetic cluster input luminosities to measured magnitudes and find that this effect causes losses on the order of 0.1-0.3 mag. Corrections assume a King (1962) profile with a concentration ($c = R_{\rm tidal}/R_{\rm core}$) of ^scaled to match the cluster’s photometrically determined F475W $R_{\rm eff}$, then extrapolated to radii beyond $R_{\rm ap}$ to obtain a magnitude correction, $m_{\rm ApCor}$. Aperture corrections can be applied to raw aperture magnitudes to obtain total magnitude estimates. These estimates accurately recover the photometry of synthetic clusters with no bias at brighter magnitudes (<19) and <0.2 mag bias for fainter clusters (see Sec. 4.2 in Paper I).
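To illustrate how such a correction can be computed, the sketch below integrates a King (1962) surface-brightness profile numerically and returns the magnitude offset between the light inside the aperture and the total light. It is not the procedure from Paper I: it takes the core radius directly rather than scaling the profile to a measured $R_{\rm eff}$, the concentration is a free input, and the sign convention (correction is negative and is added to the aperture magnitude) is an assumption.

```python
import numpy as np

def king_enclosed_fraction(r_ap, r_core, concentration, n=20000):
    """Fraction of the total King (1962) profile light that falls inside r_ap."""
    r_tidal = concentration * r_core
    r = np.linspace(0.0, r_tidal, n)
    edge = 1.0 / np.sqrt(1.0 + concentration**2)
    # Projected King profile; it drops to zero at the tidal radius.
    sigma = (1.0 / np.sqrt(1.0 + (r / r_core) ** 2) - edge) ** 2
    integrand = 2.0 * np.pi * r * sigma
    # Cumulative trapezoidal integral of the light enclosed within each radius.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    cumulative = np.concatenate([[0.0], np.cumsum(steps)])
    return float(np.interp(min(r_ap, r_tidal), r, cumulative) / cumulative[-1])

def aperture_correction(r_ap, r_core, concentration):
    """Magnitude correction (<= 0) to add to an aperture magnitude."""
    return float(2.5 * np.log10(king_enclosed_fraction(r_ap, r_core, concentration)))

# Illustrative inputs only; the concentration here is a placeholder value.
m_ap = 20.0
m_total = m_ap + aperture_correction(r_ap=3.0, r_core=1.0, concentration=10.0)
```

An aperture that reaches the tidal radius gives a correction of zero, and smaller apertures give progressively more negative corrections.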

We summarize the photometric measurements in Table 3, where we tabulate the number of detections in each band, as well as the number of objects with detections spanning various combinations of photometric bands.

We found that accounting for the presence of image artifacts was critical to obtain accurate cluster photometry in the F275W, F336W, and F110W filter bandpasses. Images in these three filters proved problematic due to their small number of repeat observations and minimal spatial overlap between neighboring images, hindering typical artifact rejection techniques that require three or more exposures. Interpolating over pipeline-rejected pixels in the F110W images was relatively straightforward; however, detecting and rejecting UV cosmic ray image artifacts was more difficult. We conservatively identify F275W and F336W artifacts by flagging bright, single-passband objects through a comparison of flux ratios among the F275W, F336W, and F475W images. This method allows us to reject hundreds of artifacts that tend to bias measurements to brighter magnitudes; however, we caution that some uncorrected artifacts may continue to affect our UV photometry.

In Figure 13 we compare the distributions of F475W magnitudes for previously known (and confirmed) clusters in the PHAT footprint and the new AP catalog. The factor of ~6 increase in the number of clusters shows the staggering improvement made possible by PHAT’s high spatial resolution imaging. The ability to differentiate between single bright stars and compact clusters allows us to include fainter, less massive clusters in the AP catalog. Ground-based imaging limited previous efforts to clusters brighter than F475W ~ 19.5, and while the HKC pushed that limit faint-ward, the amount of HST imaging available to those authors was significantly less than what is now available through PHAT.

Figures 14 and 15 show the color-color and color-magnitude distributions of AP clusters, providing a glimpse into the age composition of the catalog. While the clusters span a wide range of colors that reflect a diversity in ages, a dominant portion of the catalog lies within the following color and magnitude range: 20<F475W<22, 0<F336W-F475W<1, and 1<F475W-F814W<2. The specified region of color and magnitude parameter space points to a dominant population of ~$10^3$ $M_\odot$, ~200-400 Myr old clusters that dominate the catalog by number, consistent with the age distribution found for the Year 1 sample (Fouesneau et al. 2014). This population dominates the cluster catalog because it represents a relatively large linear age range (leading to a large number of clusters for a near constant formation history) where catalog completeness is still relatively high (50% complete to ~1,000 $M_\odot$ at 300 Myr). We note that the large color dispersion shown in Figure 14 agrees with predictions of stochastically-sampled cluster models (see Fig. 4 in Fouesneau et al. 2014). In addition, the vertical sequence spanning 15<F475W<19 with a color range of 0.2<F336W-F475W<1.3 in Figure 15 represents the old (10-14 Gyr), massive (> $10^5$ $M_\odot$) globular clusters. These massive, luminous systems also form a well-defined sequence of bright clusters in Figure 14 running from (F475W-F814W, F336W-F475W) coordinates of approximately (1.4, 0.2) to (2.1, 1.3), corresponding to a metallicity sequence running from -2.5<[Fe/H]<0.0 for these systems.

$m_{\rm Total} = m_{\rm ap} + m_{\rm ApCor}$

Fig. 13. Histogram of F475W integrated magnitudes for 2,717 AP clusters (out of 2,753 total). The red dotted histogram represents the distribution of luminosities for 401 previously known clusters confirmed by the AP catalog (out of 404 total) that lie within the PHAT footprint, showing the vast improvement in cluster identification provided by the PHAT data.

Fig. 14. Color-color diagram of 1,701 clusters with F336W, F475W, and F814W photometric detections. The 378 clusters with F475W < 19.5 are distinguished as red squares. The color-color sequence of luminous globular clusters (see text in Sec. 5.3) is prominent in the sample of bright clusters.

Fig. 15. Color-magnitude diagram of 2,464 clusters with F336W and F475W photometric detections.

We fit luminosity functions to the cluster photometry using a simple power law ($N \propto L^{-\alpha_L}$); we plot the results in Figure 16. Notably, when we remove objects that lie within the previously defined bulge region (see Section 5.1), we find that luminosity functions steepen significantly. As we argued in Paper I, old massive globular clusters dominate the bright end of the luminosity function; removing these objects, which reside primarily in the galaxy’s bulge, allows us to examine the luminosity function behavior of younger (<3 Gyr) cluster populations. The observed population-dependent variations in luminosity function indices affirm that factors such as the underlying cluster formation history, the intrinsic cluster mass function, and the stochastic conversion from mass to luminosity for less massive clusters all play a role in determining the overall distribution of cluster luminosities. Untangling these various effects for the PHAT sample is possible through direct age and mass determinations of the individual clusters; we will perform this analysis as part of future work (Beerman et al., in prep.).
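A power-law fit of this kind can be sketched as follows, assuming the differential form $dN/dL \propto L^{-\alpha_L}$, equal-number magnitude bins, and an unweighted linear fit to $\log_{10}(dN/dm)$ for bins brighter than a completeness limit. The binning details, function name, and fitting choice are illustrative assumptions rather than the fitting procedure used for the catalog. In magnitudes, $dN/dm \propto 10^{0.4(\alpha_L - 1)m}$, so the fitted slope $s$ gives $\alpha_L = 1 + 2.5s$.

```python
import numpy as np

def fit_luminosity_function(mags, completeness_limit=np.inf, n_per_bin=25):
    """Estimate alpha_L from magnitudes using equal-number bins.

    Bins whose faint edge lies beyond `completeness_limit` are ignored.
    Returns (alpha_L, slope, intercept) of the linear fit in log10(dN/dm).
    """
    m = np.sort(np.asarray(mags, dtype=float))
    centers, log_density = [], []
    for start in range(0, len(m) - n_per_bin + 1, n_per_bin):
        chunk = m[start:start + n_per_bin]
        width = chunk[-1] - chunk[0]
        if width <= 0 or chunk[-1] > completeness_limit:
            continue
        centers.append(0.5 * (chunk[0] + chunk[-1]))
        log_density.append(np.log10(n_per_bin / width))
    slope, intercept = np.polyfit(centers, log_density, 1)
    return 1.0 + 2.5 * slope, slope, intercept

# Synthetic check: luminosities drawn from dN/dL ~ L^-2 above L = 1.
rng = np.random.default_rng(0)
lum = 1.0 / (1.0 - rng.random(5000))
alpha, _, _ = fit_luminosity_function(-2.5 * np.log10(lum))
```

On the synthetic sample the recovered index should land close to the input value of 2.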

6. DISCUSSION 

6.1. Comparing the M31 Cluster Catalog to 
Extragalactic and Galactic Samples 

To place the PHAT catalog of M31 star clusters into context, first we compare the luminosity distribution of our sample to those from three nearby star-forming galaxies: M83 (Bastian et al. 2012), M33 (San Roman et al. 2010), and the LMC (Hunter et al. 2003; Popescu et al. 2012). We choose these three galaxies because they are well-known extragalactic cluster targets that have publicly-available cluster catalogs; we compare our sample to the much more heterogeneous Milky Way catalog in the next subsection. We compare the luminosity distributions of each sample in the left panel of Figure 17, where we convert from PHAT’s F475W to V-band apparent magnitudes using the following empirical relation:

$$m_V = m_{\rm F475W} - 0.363\,(m_{\rm F475W} - m_{\rm F814W}) - 0.111. \qquad (6)$$
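As a worked instance of Equation 6, the conversion can be written as a one-line function. The function name and the example magnitudes are illustrative only.

```python
def f475w_to_v(m_f475w, m_f814w):
    """Convert an F475W magnitude to V band using the empirical relation of Eq. 6."""
    return m_f475w - 0.363 * (m_f475w - m_f814w) - 0.111

# Example: a cluster with F475W = 20.0 and F814W = 19.0 (color of 1.0 mag).
m_v = f475w_to_v(20.0, 19.0)
```

For this example the color term removes 0.363 mag and the constant removes a further 0.111 mag, giving a V-band magnitude of 19.526.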

Completeness limits for the three samples scale as a function of distance: M83 has the brightest completeness limit at $M_V \sim -6$, followed by M33’s limit at $M_V \sim -5.5$, and the LMC’s limit at $M_V \sim -4.5$. The M31 detection limit of $M_V \sim -3.5$ leads to the inclusion of many more clusters, particularly those of moderate mass and intermediate ages: $10^3$-$10^4$ $M_\odot$ between 100 Myr and 1-3 Gyr (Fouesneau et al. 2014). As a result, the PHAT sample contains ~3 times more clusters than any of the other extragalactic samples compared here.

At bright magnitudes ($M_V < -6$) where all four cluster samples are complete, we can compare the number of luminous blue ($B - V < 0.5$, or equivalently F475W-F814W < 1.1) clusters in each sample. This provides a first-order comparison of the young cluster populations captured by the catalogs of our set of comparison galaxies. We show in the right panel of Figure 17 that the M83 catalog includes the largest number of blue clusters, followed in order by M33, the LMC, and M31. Differences in the star formation rate (SFR) for the galaxy regions surveyed explain the differences observed in blue cluster populations. The cluster sample from the starburst galaxy M83 corresponds to a SFR of 1.3 $M_\odot$ yr$^{-1}$ (coverage fraction of 2/5 applied to galaxy-wide SFR of 3.3 $M_\odot$ yr$^{-1}$; Boissier et al. 2005), while M33’s SFR is 0.45 $M_\odot$ yr$^{-1}$ (Verley et al. 2009) and the LMC’s SFR is 0.25 $M_\odot$ yr$^{-1}$ (Whitney et al. 2008). Within the PHAT footprint, the current SFR is much lower at ~0.1 $M_\odot$ yr$^{-1}$ (coverage fraction of 1/3 applied to galaxy-wide SFR of 0.25 $M_\odot$ yr$^{-1}$; Ford et al. 2013). Larger SFRs correlate with larger numbers of blue clusters; the relatively low number of luminous blue clusters in the AP catalog is a consequence of M31’s relatively quiescent SFR.
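The SFR values for the surveyed regions follow from the quoted galaxy-wide rates and coverage fractions. The short sketch below reproduces that arithmetic; the dictionary layout and labels are illustrative.

```python
# SFR of the surveyed region in solar masses per year:
# galaxy-wide SFR times the coverage fraction quoted in the text.
surveyed_sfr = {
    "M83": 3.3 * 2 / 5,         # coverage fraction of 2/5
    "M33": 0.45,                # quoted directly
    "LMC": 0.25,                # quoted directly
    "M31 (PHAT)": 0.25 * 1 / 3  # coverage fraction of 1/3
}

# Rank the samples from highest to lowest surveyed SFR.
ranked = sorted(surveyed_sfr, key=surveyed_sfr.get, reverse=True)
```

The resulting order (M83, M33, LMC, M31) matches the ordering of luminous blue cluster counts described above.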

Fig. 16. Luminosity functions across six PHAT passbands. Each plotted point represents an equal number of clusters (N = 25) and linear fits are made to points above the adopted completeness limit (dotted line). Top panels show results for the full AP cluster sample, while the bottom plots constrain the sample to those objects that lie within the disk, outside the inner bulge.

Fig. 17. Left: A comparison of luminosity functions for four extragalactic samples of star clusters: M31 (this work), M83 (Bastian et al. 2012), M33 (San Roman et al. 2010), and the LMC (Hunter et al. 2003; Popescu et al. 2012). This plot shows the relative difference in detection limits and total number of clusters for each catalog. Right: A comparison of the number of luminous blue clusters ($M_V < -6$, $B - V < 0.5$) in each galaxy sample as a function of SFR for the region that was surveyed in each galaxy. This plot shows that the AP sample includes fewer luminous blue clusters due to the relatively low SFR found in the PHAT survey footprint.

Next, we compare our PHAT cluster catalog to the sample of known Galactic clusters. Without question, observations of Milky Way star clusters provide rich, detailed datasets for individual clusters and their member stars that cannot be matched in an extragalactic setting. The ability to measure star-by-star proper motions, detect and resolve stars down to the hydrogen burning limit, and efficiently obtain detailed abundance information through spectroscopy of individual members are all major advantages of studying clusters in the Galaxy. However, it is interesting to explore how Galactic cluster samples compare on galaxy-integrated scales. Our Sun’s position within the disk of the Milky Way, surrounded by obscuring gas and dust along the Galactic mid-plane, does not provide the optimal vantage point for observing the distribution of clusters throughout our galaxy. In fact, we argue below that extragalactic samples provide a better assessment of overall cluster populations, due to the uniformity of selection and the ability to survey a wide range of galactic environments.

The recent catalog of Milky Way clusters by Kharchenko et al. (2013) contains 2547 clusters$^{9}$, similar to the number of entries in the PHAT cluster catalog. But while the sample sizes are comparable, the uniformity and selection function of the Milky Way clusters differ significantly from the AP clusters in M31. The sample of Milky Way clusters is compiled from a heterogeneous set of literature sources, including the earlier work of Dias et al. (2002), leading to an ill-defined selection function and catalog completeness that is difficult to characterize. Assuming a constant surface density of clusters, Kharchenko et al. (2013) suggest that the sample is complete to a radius of ~1.8 kpc around the Sun, thus covering an area of ~10 kpc$^2$. Not only is this area more than an order of magnitude smaller than the physical region covered by PHAT, but the surveyed region of the Galaxy is limited to the Solar neighborhood. Most of the area within 1.8 kpc of the Sun lies within a Galactic inter-arm region, limiting the amount of on-going star formation and range of environments one can study.

According to estimates compiled in a recent review by Portegies Zwart et al. (2010) (based on the sample of Dias et al. 2002), the young (excluding globular clusters) Milky Way cluster sample includes objects that range in mass from 25 $M_\odot$ to $5\times10^4$ $M_\odot$. Within a radius of 1.8 kpc, the complete cluster sample includes a mass range that varies over <3 orders of magnitude, up to 4000 $M_\odot$, the mass of the Orion Nebula Cluster. The proximity of the Milky Way clusters allows for the inclusion of low mass objects that remain undetected in M31; however, the accurate understanding of mass completeness and catalog selection for PHAT, along with the number and variety of clusters included, makes the AP catalog the best available resource for a wide range of cluster science studies: cluster dissolution, mass functions, cluster formation efficiency, and how cluster properties vary with environment.


6.2. Catalog Differences: Year 1 & AP 

The cluster definition we use for the AP catalog, as described in Section 0, is more liberal than the one used in our previous Year 1 catalog. In Paper I, we excluded three categories of candidate clusters that we do not explicitly reject from the AP catalog:

1. Loose Associations: Defined by their lack of centrally concentrated stars, these objects are likely to be gravitationally unbound due to their large spatial extents and low stellar densities, and were therefore rejected from inclusion in the Year 1 catalog. The AP search yielded many high-significance candidates that were not identified during the Year 1 effort.

2. Emission Line Regions: Compact, high surface brightness HII regions show up prominently in F475W imaging ([OIII] and Hβ emission lines lie within the F475W bandpass) and tend to enhance the visual appearance of associated clusters. While line emission on its own does not provide explicit evidence for or against the presence of a cluster (non-cluster HII regions and line emitting clusters both exist), we find that cluster candidates associated with emission line flux are accepted more frequently into the AP catalog than by the expert-based Year 1 search. We document this tendency because it reveals a possible systematic affecting catalog completeness that is not captured by our synthetic cluster tests: low mass clusters that produce line emission may be systematically overrepresented in the AP catalog with respect to the completeness function derived in Section 4.

3. Small Clusters: While we emphasized a liberal, inclusive approach to cluster identification in Paper I, small candidate clusters were often discarded, with a loosely-defined limit requiring 3-4 spatially correlated stars to trigger inclusion in the catalog. For the AP search, no star count limit was ever discussed.

$^{9}$ This total excludes associations, moving groups, and remnant cluster classifications from the catalog’s 3006 overall entries.

These three categories of objects represent systematic differences between the Year 1 and AP catalogs. Of these three, the loose associations represent the most conspicuous difference: the number of bright blue objects (F336W-F475W < -0.5 and F475W < 19.75) identified within the Year 1 footprint more than doubled, from 15 to 35 clusters, many of which appear extended and poorly concentrated. In an effort to clearly identify these uncertain and controversial AP clusters, we flag objects that match the following criteria as possible associations: bright (F475W < 19.75), blue (F336W-F475W < -0.5), and spatially extended. A cluster is characterized as spatially extended either through its light profile, according to its half-light radius, or its profile of resolved main sequence stars, according to the radius that contains 60% of the cluster’s main sequence stars ($R_{0.6N({\rm MS})}$). We adopt the following criteria for spatial extension: $R_{\rm eff}$ > 1.05 arcsec (4 pc), or $R_{0.6N({\rm MS})} > 0.5\,R_{\rm ap}$ for stars with F475W-F814W < 1 and F475W < 24. The combined color, magnitude, and spatial extension criteria identify 64 association-like objects; flags identifying these objects are included in Table 2. These extended candidates are the most likely examples of regions hosting spatially correlated star formation, but where the stars may not have ever been gravitationally bound to one another. As such, one should carefully evaluate the possibility that these candidates may not be the young progenitors of the older clusters we identify in this catalog.
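The association flag described above combines three cuts, which the sketch below expresses as a single boolean test. The function and argument names are hypothetical, and the 60% radius is approximated here with a simple quantile of the main sequence star radii, which may differ from the measurement used for the catalog.

```python
import numpy as np

def ms_radius_60(star_radius, star_f475w, star_f814w):
    """Radius enclosing 60% of main sequence stars (F475W-F814W < 1 and F475W < 24)."""
    radius = np.asarray(star_radius, dtype=float)
    f475w = np.asarray(star_f475w, dtype=float)
    f814w = np.asarray(star_f814w, dtype=float)
    selected = ((f475w - f814w) < 1.0) & (f475w < 24.0)
    if not selected.any():
        return float("nan")  # no main sequence stars: the test below evaluates False
    return float(np.quantile(radius[selected], 0.6))

def is_possible_association(f475w, f336w, r_eff_arcsec, r_ms60, r_ap):
    """Bright, blue, and spatially extended, following the criteria in the text."""
    bright = f475w < 19.75
    blue = (f336w - f475w) < -0.5
    extended = (r_eff_arcsec > 1.05) or (r_ms60 > 0.5 * r_ap)
    return bool(bright and blue and extended)

# Example: a bright, blue object with a large half-light radius is flagged.
flagged = is_possible_association(19.0, 18.0, r_eff_arcsec=1.2, r_ms60=0.1, r_ap=1.0)
```

Either extension test is sufficient on its own, so a compact light profile with a broad main sequence star distribution is still flagged.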

7. SUMMARY 

We present our methodology for transforming crowd-sourced effort into cluster catalogs for the AP-based analysis of the PHAT survey data. We demonstrate the validity of our crowd-sourced cluster identification methodology and show good consistency between our results and expert-derived by-eye catalogs. In addition, we present a thorough analysis of the resulting completeness characteristics of our cluster catalog, an essential component to any study of galaxy-wide star cluster populations. Our completeness tests demonstrate that our PHAT cluster catalog is mass-limited and 50% complete to ~500 $M_\odot$ up to an age of 100 Myr, at which point the catalog becomes luminosity-limited at F475W ~ 21.5.

The final cluster catalog includes 2753 entries, spanning more than three orders of magnitude in F475W luminosity. Our use of HST imaging provides access to systems spanning the range from massive globular clusters to low-mass (< $10^3$ $M_\odot$) clusters in the disk, similar to Milky Way open clusters. Analysis of this sample provides a unique and unmatched opportunity to obtain a comprehensive understanding of star cluster populations within a large spiral galaxy. The AP catalog serves as the definitive base data product that will enable an array of stellar cluster studies within M31.
APPENDIX

CATALOG CONSTRUCTION PROCEDURE 


AP catalog construction occurs in two phases: merging identifications on an image-by-image basis, followed by the merging of per-image catalogs into a single survey-wide catalog. Throughout this description, we use the terms “click” and “identification” interchangeably to represent image markers placed by AP volunteers. We begin by describing the first phase, which consists of three steps:

1. Create Candidate List: From the set of all cluster and galaxy identifications recorded for a given image, we construct a list of initial candidate objects by grouping center positions using a matching radius of 20 pixels (equivalent to 1 arcsec or 3.81 pc). Our choice of matching radius was tuned such that clicks representing the same object were merged together, but distinct neighboring objects were not merged. We observed that the positioning of marker centers is quite precise; the distribution of user-determined centers for well-defined image features can be described as a 2D Gaussian with σ = 2 pixels (equivalent to 0.1 arcsec or 0.4 pc). We iterate through the list of identifications, sorted from smallest to largest radius under the assumption that small radii identifications will have the most accurate center positions. Each resulting candidate consists of a central position and radius, represented by the mean X and Y image coordinates and the median radius of each set of merged clicks.

2. Prune Candidates: Next we prune duplicate objects from the initial set of candidate objects. Here, a duplicate object is a candidate that corresponds to the same image feature as another in the initial list, but whose component clicks were not merged during the previous step. This process begins by iterating through the initial candidate list in order of decreasing $f_{\rm clst+gal}$. For each iteration, we define the candidate in question as the primary object, and search for secondary objects, which are any other initial candidates whose circular boundary encloses the primary’s center. If we identify any secondary objects with an $f_{\rm clst+gal}$ less than that of the primary, the secondary is dropped from the candidate list. If a secondary candidate has a higher $f_{\rm clst+gal}$ than the primary, the primary is dropped from the candidate list. Once we have iterated through all initial candidates, the result of this pruning procedure is a list of spatially-unique candidates.

3. Re-associate Identifications with Final Candidates: To calculate final hit-rate statistics for each surviving candidate, we identify all original identifications where the candidate aperture encloses the identification center and vice versa and use these clicks to calculate $f_{\rm clst+gal}$, $f_{\rm cluster}$, and $f_{\rm galaxy}$ values. However, candidates retain their previous center and radius values. Finally, we remove any candidates with only one associated identification (i.e., single click candidates), while remaining multi-click candidates go on to join the final per-image catalog.

We present two image examples that show how our catalog construction algorithm works. The top row of Figure 18 shows all object identifications and their associated centers for each image. The second row shows the full list of merged candidates that result from the first step described above, where those that survive the pruning process are highlighted in bold. Finally, we show the final list of candidates that survive the candidate pruning and subsequent single-click cut overlaid on top of the field’s single-band F475W image, where the most significant detections ($f_{\rm clst+gal} > 0.1$) are shown in red. The left column of Figure 18 shows the B02-F11_22 sub-image, a field that consists primarily of well-separated, well-defined candidates. The right column shows B06-F16_22, which highlights a challenging case with many non-unique, overlapping feature identifications.

Fig. 18. Catalog construction examples, featuring identifications from B02-F11_22 (left) and B06-F16_22 (right). Top panels: We plot all individual cluster and galaxy markings along with their centers. Middle panels: We plot all initial object candidates that result from the grouping of center positions. Gray circles represent candidates that were pruned, while black circles represent candidates that go on to become final catalog entries. Bottom panels: Each final candidate is shown, color coded by its final status: clusters (thick blue), galaxies (thick red), ancillary candidates with $f_{\rm clst+gal} > 0.1$ (thin black), and low significance candidates with $f_{\rm clst+gal} < 0.1$ (thin gray).

The B06-F16_22 image example presents a particularly informative example of our cataloging algorithm in action. The transition from the raw identification data in the top panel to the initial candidate list in the middle panel shows that our methodology for merging clicks (using a small 20 pixel matching radius) is quite conservative, ensuring that nearby objects are not incorrectly combined. Next, this initial candidate list is pruned to remove true duplications, cutting the first set of candidates down to those plotted in black in the middle panel. This operation takes the significance of each candidate into account (according to $f_{\rm clst+gal}$ scores, reflecting total numbers of clicks), and yields a final list of objects that are spatially unique. Identifications associated with the dropped duplicate candidates are not discarded, as most are re-associated during the final step of per-image processing. Finally, the bottom panel shows the output of catalog processing, showing reasonable results even for this complex set of inputs. While the low and moderate significance identifications (gray and black circles, respectively) are not included in the AP catalog published in Table 2, these objects are all recoverable due to their inclusion in the publicly available ancillary catalog presented in Appendix C.
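The per-image phase (grouping, pruning, and re-association) can be sketched in Python as follows. This is an illustrative reading of the description, not the AP pipeline: the greedy grouping order, the treatment of ties during pruning, the use of click counts as a stand-in for $f_{\rm clst+gal}$ within a single image, and the definitions of the fractions ($f_{\rm clst+gal}$ as clicks per image view, $f_{\rm cluster}$ and $f_{\rm galaxy}$ as shares of the associated clicks) are all assumptions.

```python
import numpy as np

MATCH_RADIUS_PX = 20.0  # matching radius quoted in the text

def group_clicks(clicks):
    """Step 1: greedily merge click centers, visiting the smallest radii first.

    clicks: list of (x, y, radius, kind) with kind in {"cluster", "galaxy"}.
    """
    order = sorted(range(len(clicks)), key=lambda i: clicks[i][2])
    unused = set(order)
    candidates = []
    for i in order:
        if i not in unused:
            continue
        xi, yi = clicks[i][0], clicks[i][1]
        members = [j for j in order if j in unused and
                   np.hypot(clicks[j][0] - xi, clicks[j][1] - yi) <= MATCH_RADIUS_PX]
        unused.difference_update(members)
        candidates.append({
            "x": float(np.mean([clicks[j][0] for j in members])),
            "y": float(np.mean([clicks[j][1] for j in members])),
            "r": float(np.median([clicks[j][2] for j in members])),
            "n": len(members),
        })
    return candidates

def prune(candidates):
    """Step 2: drop duplicates, visiting candidates from most to fewest clicks."""
    ranked = sorted(candidates, key=lambda c: c["n"], reverse=True)
    dropped = set()
    for a, primary in enumerate(ranked):
        if a in dropped:
            continue
        for b, other in enumerate(ranked):
            if b == a or b in dropped:
                continue
            d = np.hypot(other["x"] - primary["x"], other["y"] - primary["y"])
            if d > other["r"]:
                continue  # other's circle does not enclose the primary's center
            if other["n"] < primary["n"]:
                dropped.add(b)
            elif other["n"] > primary["n"]:
                dropped.add(a)
                break
            # Ties are left untouched; the text does not specify this case.
    return [c for k, c in enumerate(ranked) if k not in dropped]

def reassociate(candidates, clicks, n_views):
    """Step 3: recount clicks for each survivor and apply the single-click cut."""
    final = []
    for c in candidates:
        kinds = []
        for x, y, r, kind in clicks:
            d = np.hypot(x - c["x"], y - c["y"])
            if d <= c["r"] and d <= r:  # mutual enclosure of centers
                kinds.append(kind)
        if len(kinds) < 2:
            continue
        final.append({
            "x": c["x"], "y": c["y"], "r": c["r"], "n": len(kinds),
            "f_clst_gal": len(kinds) / n_views,
            "f_cluster": kinds.count("cluster") / len(kinds),
            "f_galaxy": kinds.count("galaxy") / len(kinds),
        })
    return final
```

A typical call chains the three steps, for example `reassociate(prune(group_clicks(clicks)), clicks, n_views)`, where `n_views` is the number of volunteers who viewed the image.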

The primary AP base data product is produced in the second phase of the construction process: merging per-image catalogs into a final survey-wide catalog. We perform this merge in a two-step process:

1. Match Per-Image Candidates: We compile a list of all sub-image catalog entries, and iterate through each entry in order from highest $f_{\rm clst+gal}$ to lowest. During each iteration, we define the candidate in question as the primary object, and search for secondary objects, which are any other candidates whose circular boundary encloses the primary’s center and vice versa. If we identify any secondary objects, these matches are immediately removed from the list. When complete, the resulting list of surviving objects represents our final list of spatially-unique catalog entries.

2. Merge Candidate Properties: To determine the properties of each final catalog object, we consider each entry and its set of associated secondary entries. From this set of per-image objects, we identify those that lie completely within the bounds of their host sub-image (whose radius is less than the distance to the closest image edge), merge their positions (in RA/Dec coordinates) and radii using the mean of their individual values, and assign final $f_{\rm clst+gal}$, $f_{\rm cluster}$, and $f_{\rm galaxy}$ values using a mean weighted by the number of total sub-image views. Excluding objects that do not fall completely within their host image allows us to limit the influence of edge effects and biases on the final cataloged properties. If none of the merged per-image entries pass this edge criterion, we adopt the properties of the entry that lies furthest from an image edge.
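The two survey-wide steps can be sketched in the same spirit. The entry fields (`x`, `y`, `r`, `f`, `n_views`, `edge_dist`) are hypothetical, positions are assumed to be expressed in a common flat coordinate system rather than RA/Dec, and only a single score `f` is merged for brevity.

```python
import numpy as np

def match_entries(entries):
    """Step 1: group per-image entries that mutually enclose each other's centers."""
    ranked = sorted(entries, key=lambda e: e["f"], reverse=True)
    taken = set()
    groups = []
    for a, primary in enumerate(ranked):
        if a in taken:
            continue
        taken.add(a)
        group = [primary]
        for b, other in enumerate(ranked):
            if b in taken:
                continue
            d = np.hypot(other["x"] - primary["x"], other["y"] - primary["y"])
            if d <= primary["r"] and d <= other["r"]:
                taken.add(b)  # matched secondaries leave the list immediately
                group.append(other)
        groups.append(group)
    return groups

def merge_group(group):
    """Step 2: merge the entries that lie fully inside their host sub-image."""
    inside = [e for e in group if e["r"] < e["edge_dist"]]
    if not inside:
        # No entry passes the edge criterion: keep the one furthest from an edge.
        best = max(group, key=lambda e: e["edge_dist"])
        return {k: best[k] for k in ("x", "y", "r", "f")}
    weights = np.array([e["n_views"] for e in inside], dtype=float)
    return {
        "x": float(np.mean([e["x"] for e in inside])),
        "y": float(np.mean([e["y"] for e in inside])),
        "r": float(np.mean([e["r"] for e in inside])),
        "f": float(np.average([e["f"] for e in inside], weights=weights)),
    }
```

Weighting by the number of views means an entry seen by 80 volunteers counts twice as much toward the merged score as one seen by 40.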

PHAT BACKGROUND GALAXY CATALOG 

To define an AP galaxy sample, we base our selection on a combination of $f_{\rm clst+gal}$ and $f_{\rm galaxy}$ criteria. Utilizing the $f_{\rm clst+gal}$ metric allows for better recovery of moderate and high $f_{\rm clst+gal}$ objects with $f_{\rm galaxy}$ scores that lie on the tail (0.3-0.8) of the distribution. Adopting an $f_{\rm galaxy}$ threshold of 0.3, as discussed in Section 3, we construct a completeness curve for the galaxy sample to choose an appropriate $f_{\rm clst+gal}$ cutoff. Similar to our cluster analysis, we compare the AP sample to the Year 1 galaxy sample. The Year 1 galaxy sample was not a focused effort to identify all possible galaxies; therefore, we do not categorize AP identifications that do not match Year 1 galaxies as contaminants, but study the behavior of the relative completeness fraction of these objects in a similar way. We do not pursue the application of user weights for these results, but derive a single unweighted curve presented in Figure 19.

We observe a transition in the behavior of the
completeness curve at a Year 1 completeness of ~0.67.
Beyond this point, the slope of the completeness curve
becomes steeper; quantitatively, this transition
represents the point of diminishing returns, where more
non-Year 1 objects are being added to the sample than
previously identified Year 1 objects. We choose this
transition point as a suitable limit for catalog
inclusion. Therefore, we define the AP galaxy sample
using the following selection criteria:

f_clst+gal > 0.37 AND f_galaxy > 0.3 (B1)

These criteria select a sample of 2,270 background
galaxies. The catalog is presented in Table 4 and their
spatial distribution is shown in Figure 20.

Fig. 19. — Top: Completeness versus contamination curve for
galaxy sample. Bottom: Completeness versus f_clst+gal cutoff
values.
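
The selection in Equation B1 and the cutoff scan behind the completeness curve can be illustrated with a short Python sketch. It is a hedged illustration rather than the analysis code used for the catalog: the dictionary keys (`f_clst_gal`, `f_galaxy`, `in_year1`) and both function names are hypothetical, and the sketch assumes that every Year 1 galaxy appears in the input list with `in_year1` set to `True`. Only the thresholds 0.37 and 0.3 come from the text.

```python
def select_galaxies(objects, f_clst_gal_min=0.37, f_galaxy_min=0.3):
    """Apply the two-part cut of Equation B1 (strict inequalities)."""
    return [
        o for o in objects
        if o["f_clst_gal"] > f_clst_gal_min and o["f_galaxy"] > f_galaxy_min
    ]


def completeness_curve(objects, cutoffs, f_galaxy_min=0.3):
    """For each f_clst_gal cutoff, return
    (cutoff, Year 1 completeness, number of selected non-Year 1 objects)."""
    n_year1 = sum(1 for o in objects if o["in_year1"])
    curve = []
    for cut in cutoffs:
        selected = select_galaxies(objects, cut, f_galaxy_min)
        n_match = sum(1 for o in selected if o["in_year1"])
        completeness = n_match / n_year1 if n_year1 else float("nan")
        curve.append((cut, completeness, len(selected) - n_match))
    return curve


# Toy input (invented values, for illustration only).
toy = [
    {"f_clst_gal": 0.9, "f_galaxy": 0.8, "in_year1": True},
    {"f_clst_gal": 0.5, "f_galaxy": 0.5, "in_year1": True},
    {"f_clst_gal": 0.4, "f_galaxy": 0.6, "in_year1": False},
    {"f_clst_gal": 0.2, "f_galaxy": 0.9, "in_year1": True},
    {"f_clst_gal": 0.8, "f_galaxy": 0.1, "in_year1": False},
]
toy_sample = select_galaxies(toy)
toy_curve = completeness_curve(toy, [0.1, 0.45])
```

Scanning a fine grid of cutoffs and comparing the growth in Year 1 matches with the growth in non-Year 1 objects is one way to locate a point of diminishing returns like the one described above.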

ANCILLARY CATALOG DATA 

Additional Candidate Catalog: In addition to the
AP clusters presented in Table 2 and background galaxies
presented in Table 4, we present Table 5 containing 8775
other candidate object identifications with
f_clst+gal > 0.1 that were not included in either of the
other data tables.

Synthetic Cluster Results: We present cluster-by-cluster
synthetic recovery results in Table 6 to allow for
custom completeness analyses. The table includes input
cluster parameter information (e.g., age, mass, R_eff,
position) as well as AP catalog metadata.
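
As one example of a custom completeness analysis, the recovered fraction can be binned by input mass. The Python sketch below is illustrative only: the key names (`log_mass`, `log_age`, `detected`) and the function are hypothetical, the bin edges are arbitrary, and the five rows are the sample entries printed in Table 6, which are far too few for a meaningful measurement.

```python
def completeness_by_mass(rows, bin_edges, max_log_age=None):
    """Return (lo, hi, n_input, detected_fraction) for each log-mass bin.

    detected_fraction is None when a bin holds no synthetic clusters.
    """
    results = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [
            r for r in rows
            if lo <= r["log_mass"] < hi
            and (max_log_age is None or r["log_age"] < max_log_age)
        ]
        n = len(in_bin)
        frac = sum(1 for r in in_bin if r["detected"]) / n if n else None
        results.append((lo, hi, n, frac))
    return results


# The five sample rows shown in Table 6 (log mass, log age, Detected flag).
sample_rows = [
    {"log_mass": 3.17, "log_age": 7.30, "detected": True},
    {"log_mass": 4.51, "log_age": 9.60, "detected": True},
    {"log_mass": 2.17, "log_age": 8.10, "detected": True},
    {"log_mass": 3.92, "log_age": 10.05, "detected": False},
    {"log_mass": 4.47, "log_age": 10.05, "detected": True},
]
all_ages = completeness_by_mass(sample_rows, [2.0, 3.0, 4.0, 5.0])
young_only = completeness_by_mass(sample_rows, [2.0, 3.0, 4.0, 5.0], max_log_age=8.0)
```

With the full electronic table, the same function can be applied with finer bins, and `max_log_age=8.0` restricts the estimate to ages below 100 Myr.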

COMMENTARY ON EXISTING CLUSTER CATALOGS 

To supplement the broad comparison to previously
published cluster catalogs that we presented in Section
5.2, here we provide commentary on conflicting M31
object classifications. We summarize these differences
numerically in Table 7 and present classification
updates in Table 8.

ARCHIVAL IMAGE SEARCH AND CATALOGS 

As part of the second round of AP data collection,
we included additional, non-PHAT ACS images obtained
from the HST archive. These images were obtained by
a program (PID: 10273, PI: Crotts) that observed four
contiguous strips within M31, each composed of 8
side-by-side ACS fields. See Figure 2 for the locations
of the strips with respect to the PHAT survey footprint.
This program utilized an F555W, F814W filter combination
and exposure times that are shorter than those of
PHAT: F555W varying between 81 and 413 sec, F814W
varying between 457 and 502 sec. We divided each of
the 32 ACS images into 54 sub-images, yielding a total
of 1,728 search images.

Fig. 20.— Spatial distribution of AP background galaxy catalog overlaid on the PHAT survey-wide F475W image.


Following the same catalog construction procedures
(see Section 3) and selection criteria (see Section 5.1)
used for the PHAT classification data, we construct
catalogs of clusters and background galaxies. We present
the cluster sample in Table 9 and the background galaxy
sample in Table 10, and compile an ancillary sample of
all other identifications with f_clst+gal > 0.1 in
Table 11.
TABLE 1

User Weighting Parameters

                              Detection           Non-detection                  Optimal       Optimal        f_cluster
Name                          logistic  logistic  logistic  logistic  d_optimal  Completeness  Contamination  Cutoff
Uniform Weights               ...       ...       ...       ...       0.1809     0.8528        0.1052         0.5114
Uniform Weights (Match Comp)  ...       ...       ...       ...       0.1928     0.8811        0.1518         0.4512
Uniform Weights (Match Cont)  ...       ...       ...       ...       0.1828     0.8453        0.0974         0.5214
Best Weights                  16.0      0.6       39.0      1.1       0.1543     0.8811        0.0984         0.6416


TABLE 2

AP Cluster Catalog

AP ID       RA (J2000)  Dec (J2000)  R_ap (")  R_eff (")  m_ApCor  F275W_ap  σ     F336W_ap  σ     F475W_ap  σ
f_clst+gal  f_galaxy    f_cluster,W  PC ID     Alt ID     Flags    F814W_ap  σ     F110W_ap  σ     F160W_ap  σ

1           11.435516   41.698562    2.19      0.60       -0.01    20.12     0.04  19.16     0.01  18.77     0.01
1.0000      0.0000      1.0000       ...       B374-G306  ...      17.69     0.08  17.19     0.15  16.59     0.20

2           11.366514   41.701013    1.86      0.61       -0.04    20.91     0.10  20.25     0.02  20.01     0.10
0.9717      0.0083      0.9894       ...       M085       ...      19.05     0.21  >18.06    ...   >17.06    ...

3           11.471290   42.049246    1.95      0.88       -0.14    21.27     0.14  20.81     0.03  20.67     0.08
1.0000      0.0000      1.0000       ...       ...        ...      20.07     0.31  >18.97    ...   >18.07    ...

4           11.474664   42.038351    2^7       0.98       -0.05    20.10     0.04  19.10     0.02  18.75     0.03
1.0000      0.0227      0.9909       ...       B483-D085  ...      17.78     0.08  17.29     0.16  16.66     0.21

5           10.991636   41.359328    1.46      0.40       -0.01    20.88     0.04  20.29     0.03  20.09     0.06
1.0000      0.0000      1.0000       ...       M005       ...      19.36     0.10  18.73     0.21  17.72     0.25

NOTE. — Table 2 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown here for guidance regarding its form and content.
Each object occupies two lines, matching the two header lines. Three-sigma upper limits are denoted by a ">" symbol. PC ID and Alt ID refer to cluster identifiers from the Year 1 catalog and other literature sources, respectively.
Flags: E = Extended Object (see Section 6.2); B = Bulge or B03 object manually added (see Section 5.1).

Aperture corrections are provided such that m_Total = m_ap + m_ApCor.


TABLE 3

Passband Photometric Quality
Comparison for Cluster Sample

Passband            N(Detections)
F275W               1733 (62.9%)
F336W               2481 (90.1%)
F475W               2717 (98.7%)
F814W               1871 (68.0%)
F110W               1209 (43.9%)
F160W               1035 (37.6%)
F336W+F475W         2464 (89.5%)
F475W+F814W         1867 (67.8%)
F336W+F475W+F814W   1701 (61.8%)


TABLE 4

AP Background Galaxy Catalog

AP ID   RA (J2000)  Dec (J2000)  R_ap (")  f_clst+gal  f_galaxy  F814W_ap  σ
8       11.447226   42.268672    2.75      1.0000      0.9884    18.34     0.05
10      11.922144   42.102526    3.53      1.0000      1.0000    18.53     0.11
20      11.911096   42.076717    2.00      0.9902      0.9604    20.07     0.29
21      11.585498   41.726941    3.82      0.9901      0.9900    16.04     0.02
22      11.270065   41.312829    2.38      0.9901      0.9500    18.67     0.11

Note. - Table 4 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown here for guidance regarding its form and content.
TABLE 5

AP Ancillary Catalog

AP ID       RA (J2000)  Dec (J2000)  R_ap (")  R_eff (")  m_ApCor  F275W_ap  σ     F336W_ap  σ     F475W_ap  σ
f_clst+gal  f_galaxy    f_cluster,W  PC ID     Alt ID     Flags    F814W_ap  σ     F110W_ap  σ     F160W_ap  σ

1706        11.393493   41.774981    1.25      0.36       -0.02    >21.62    ...   21.23     0.24  21.21     0.20
0.3415      0.0000      0.5252       ...       ...        ...      >20.08    ...   >19.39    ...   >18.73    ...

2073        11.701786   41.963523    1.09      0.46       -0.11    23.85     0.39  23.21     0.07  22.57     0.16
0.7738      0.1538      0.6389       ...       ...        ...      20.95     0.33  20.09     0.34  19.16     0.31

2149        11.133017   41.395088    2.51      1.69       -0.38    16.76     0.04  17.28     0.15  18.80     0.04
0.5294      0.0000      0.4364       ...       ...        ...      >19.62    ...   >18.12    ...   >16.93    ...

2486        11.554857   41.873578    1.81      0.23       -0.00    19.33     0.11  19.23     0.11  >20.12    ...
0.3372      0.0345      0.3793       ...       ...        ...      >19.57    ...   >18.62    ...   >17.77    ...

2532        10.915584   41.487991    2.08      1.08       -0.21    21.71     0.34  20.63     0.17  20.06     0.16
0.7263      0.2754      0.5502       ...       SK070A     ...      18.14     0.09  17.46     0.11  16.78     0.22

NOTE. - Table 5 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown here for guidance regarding its form and content.
Each object occupies two lines, matching the two header lines. Three-sigma upper limits are denoted by a ">" symbol. PC ID and Alt ID refer to cluster identifiers from the Year 1 catalog and other literature sources, respectively.

Aperture corrections are provided such that m_Total = m_ap + m_ApCor.


TABLE 6

Synthetic Cluster Results

ID          log(Mass/M_sun)  log(Age/yr)  Z           ?         R_eff,in (")  F475W_in  F814W_in  F475W_3,in
RA (J2000)  Dec (J2000)      R_ap (")     f_clst+gal  f_galaxy  f_cluster,W   N(MS)     N(RGB)    Detected

1           3.17             7.30         0.019       1.612     0.319         20.75     20.06     21.07
11.012636   41.181335        1.39         0.9333      0.0000    0.9997        169       374       T

2           4.51             9.60         0.019       0.253     1.397         20.19     18.34     20.21
11.003787   41.184849        1.99         0.7326      0.0159    0.9386        192       418       T

3           2.17             8.10         0.019       0.230     0.467         21.18     21.18     22.36
10.985614   41.192447        1.42         0.5222      0.0000    0.6968        145       468       T

4           3.92             10.05        0.0001      1.366     0.343         22.92     21.02     23.22
11.004518   41.190121        1.16         0.1786      0.0000    0.1575        200       398       F

5           4.47             10.05        0.004       0.370     0.683         20.72     18.82     20.87
10.990636   41.195372        1.47         0.7126      0.0806    0.9198        199       497       T

Note. — Table 6 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown here for guidance regarding
its form and content. Each object occupies two lines, matching the two header lines. N(MS) and N(RGB) values are evaluated per search image, as defined in Section


TABLE 7

Summary of Existing Cluster Catalog Classifications and
Revisions

Catalog   Clusters   Candidates^a  Non-Cluster^b
          # Accepted as AP Cluster (# Rejected as Not AP Cluster)
Year 1    532 (69)   95 (142)      39
RBC       232 (2)    28 (18)       40
Caldwell  232 (10)   ...           8
HKC       156 (57)   ...           ...

a The "candidate" classification refers to the PHAT Year 1 possible cluster
table and RBC flags 2 and 3.

b The "non-cluster" classification refers to galaxy, star, HII region, and
other non-cluster classifications.


TABLE 8

RBC Flag Revisions and Commentary

AP ID  RBC Name  New Flag  Old Flag  Comments
55     SK102A    1         6         ...
68     SK213B    1         2         ...
77     SK182B    1         6         ...
89     M065      1         2         ...
91     M004      1         6         ...

Note. — Table 8 is published in its entirety in the electronic
edition of the Astrophysical Journal. A portion is
shown here for guidance regarding its form and content.
TABLE 9

Archival AP Cluster Catalog

AAP ID  RA (J2000)  Dec (J2000)  R_ap (")  f_clst+gal  f_galaxy  f_cluster,W  Alt ID
1       10.522610   41.435868    2.42      0.9903      0.0294    0.9730       B069-G132
2       10.541597   40.907603    2.82      0.9804      0.0000    0.9905       ...
8       10.509294   40.896004    1.97      0.9800      0.0408    0.9908       SK041A
9       10.753904   41.656852    1.42      0.9126      0.0000    0.9915       ...
11      10.521275   40.885136    1.66      0.9712      0.0594    0.9828       ...

Note. — Table 9 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown
here for guidance regarding its form and content.


TABLE 10

Archival AP Background Galaxy Catalog

AAP ID  RA (J2000)  Dec (J2000)  R_ap (")  f_clst+gal  f_galaxy
3       10.846248   41.040394    1.99      0.9804      0.8900
4       10.536274   41.444508    5.10      0.9802      0.9596
5       10.595225   40.953062    2.44      0.9802      0.9192
6       10.465582   41.411634    4.50      0.9802      0.9495
7       10.463831   41.416378    4.79      0.9802      0.9697

NOTE. — Table 10 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown here
for guidance regarding its form and content.



TABLE 11

Archival AP Ancillary Catalog

AAP ID  RA (J2000)  Dec (J2000)  R_ap (")  f_clst+gal  f_galaxy  f_cluster,W  Alt ID
238     10.491200   41.439039    1.01      0.7788      0.2716    0.6317       KHM31-357
334     10.750010   41.001666    1.35      0.6923      0.1389    0.5875       KHM31-453
389     10.651615   41.552907    1.36      0.6400      0.2188    0.5784       ...
399     10.546207   41.509361    1.41      0.6341      0.0385    0.6197       KHM31-367
400     10.950410   41.192429    1.36      0.6337      0.2031    0.5965       ...

Note. — Table 11 is published in its entirety in the electronic edition of the Astrophysical Journal. A portion is shown
here for guidance regarding its form and content.















